Generally it is recognised that scientifically designed clinical trials play an
important part in the development and the evaluation of medical treatments. Such
trials fundamentally contain natural administrative and ethical conflicts.

In the course of this thesis we will look at the developments in the analysis of
failure time data and deal with the study of interrelationships within clinical trial data.
The general utilisation of such analytical methods has been made possible by the
widespread availability of fast computerised processing power.

In the area of survival distributions we will consider various empirical distributions
and perform a comparative study of the non-parametric and parametric methods and deal
with the recent developments in the area of semi non-parametric methods, using Cox's
proportional hazards model. We will perform an assessment of the power efficiencies of
tests for computer simulated clinical trial data, under varying sample sizes, censoring
levels, significance limits, asymptotic normality and likelihood tests, time dependency
assumptions, and a range of treatment and prognostic effect values.

We consider interrelationships of relevance in the context of trials to be those of
prognostic effects as well as the event time variabilities under a multivariate failure
time context.

We will deal with two data sets, both of which relate to breast cancer. Initially 
we consider a data set from a clinical trial organised in Edinburgh, and study prognostic 
and treatment effects for a set of risk factors such as local recurrence, metastatic 
recurrence and death. Finally we use a data set on breast cancer patients purely 
for the assessment of prognostic effects. In the latter study we consider a set of
accepted prognostic effects as well as a set of measurements dealing with tumour 
range and extent. In the discussions of the above we present various models in
order to test time periods to and from intervening events in a multivariate study.
We will also consider time dependency of various effects in order to check on the
persistence of an effect on the time scale.
CHAPTER 1



INTRODUCTION 



Statistical inference has been increasingly regarded as
a necessary tool for the assessment of risk in its various forms.
This necessity to examine and compare risks is becoming an essential 
part of the methodology of a large number of subjects that deal with 
risk in its varied and distinct forms such as occupational hazards, 
industrial developments, environmental risks and patient management 
in hospitals. As an abstract formulation we can regard the general 
problem as that of choosing between two or more courses of action 
knowing that the courses of action have risk values attached to them. 
These risks are expressed partly in terms of costs and benefits to the
individual and partly in terms of those to society as a whole. We can
thus identify a set of general questions; finding relevant
answers to them in each particular context is the essential part of the
methodology. How much information is sufficient for discriminating
between the courses of action? What are the acceptable levels of 
benefit for introducing a new course of action? What are the 
appropriate measures of risk? What are the conflicting rights
of the individuals and institutions, and finally, how do we collect
the relevant information?

The principal part of the notion of risk and its 
appraisal is introduced as soon as one considers social and human 
dimensions of a decision. In contrast, within the framework of 
most natural experiments the concept of risk does not usually 
arise and is substituted with that of deriving optimal rules 
for obtaining appropriate measures at minimum cost and time 
in collecting the relevant information. 

The methodology we are dealing with in this thesis relates to
that of a clinical trial and the analysis of failure time data for a
clinical trial. The principal aim is to show how, for this
particular application and within the limits of controlled experiments,
concepts such as the control of concomitant information, an exploratory
approach to analysis and the study of association between
various risks may be employed to provide a better understanding
of the data.



1.1 HISTORY. 



In 1693 E. Halley, the well-known discoverer of
Halley's Comet, produced a life table of the population of Breslau
in Germany. This data was based on the city records and was
published in the Philosophical Transactions of the Royal Society
of London, with the title of "An estimate of the degrees of the mortality
of mankind, drawn from curious tables of the births and funerals
at the city of Breslaw." The data was composed of the age and time
of death and, more importantly, the cause of death was specified to
be small pox or other causes.

This final small detail on the cause of death in Halley's
data later led Daniel Bernoulli in 1760 to reformulate the
problem. In his paper, which was read at the Royal Academy of
Science, Paris, Bernoulli adopts an ingenious and simple argument
to derive, for each individual who died of small pox, the length of
life he would have had if the risk of death from small pox had been eliminated.

However, the method is based on the assumption that
the disease affects the total population in a uniform manner, and
thus the method is not sensitive to the possibility of structural
variability for smaller subgroups, such as a small subgroup of
patients being strong and thus more immune from the disease. One
rather obvious source of structural variability was pointed out by
D'Alembert (1761), the eminent French mathematician of the time.
He noted that the probability of contracting small pox as well as
dying from it may well be dependent on age.
At the time when d'Alembert and Bernoulli were constructing
the early life tables, mathematical tools had not been developed for
a more refined analysis. The method is based on a deterministic
analysis of the numbers in a time period and does not provide
a probabilistic interpretation. Further, it seems that although Halley,
being an astronomer himself, may have been interested in a functional
form of parameters to investigate the total population and possibly a
population distribution, Bernoulli adopted a non-parametric
approach at each interval, based on a number of cases,
to determine the expected values. (The distinction between
parametric and non-parametric methods will be discussed more
extensively later.)

In actuarial studies a similar problem arises where a population 
is measured for the risk of death. At the time of analysis some 
members of the population may not have completed their time to 
the response of interest (death) and therefore no information is 
available on their time of death. By 1825, Probability Theory 
had been well developed and Gompertz (1825) had produced a 
function to approximate such a population survival distribution 
with the above property of some cases not contributing to death 
times. This distribution, known as Gompertz-Makeham, has been the
central theme of many models in actuarial theory. The model
proposed by Gompertz and further by W.M. Makeham (1875) is very
realistic in that, the basis of its philosophy is to allow 
separate risks of withdrawals from the population with a response, 
such as death due to a particular cause (e.g. cancer), or due to 
other causes. In fact, ignoring the possibility of different
rates of death due to different causes would at times invalidate
the conclusions of the study. However, the above flexible approach
allows a check on the assumptions regarding the relevant causes of
death. The importance of this approach in allowing different risks
was not introduced into medical studies until the mid-1950's with
the contribution of J. Cornfield in application to clinical trials.
Studies carried out as late as 1939 by Bernstein, Binham and Ach 
came to an invalid conclusion through overlooking the problems of 
choice of relevant response rates as the final events of interest. 
Prior to the works of Cornfield, similar developments were taking 
place in another branch of applied mathematics. Emergence of complex 
mechanical devices and early electronic networks required mathematical 
models for a representation of the logical flow of the chance of 
failure, and a final assessment of the probability of failure of the 
system. These areas were named reliability and life-testing. 
At present a major application of these techniques is related to the
development of defence systems in the U.S.A. and U.S.S.R. There are
many similarities between reliability studies and survival studies
of a population. The conceptual simplicity of electronic
systems was partly responsible for the emergence of recent trends in
multivariate failure time analysis. Any device may be composed
of a number of components, each with its own risk of developing
a failure. These components may be in series, so that failure of
a single component results in a system failure, or they may be in
parallel. A medical example of the latter would be a study of
kidney failure where damage to one kidney would not be fatal. The
basis of this approach in medical studies and population studies
was laid by Fix and Neyman in 1951 and Chiang in 1960.
As was pointed out, a sound methodology had been developed
by the mid-1950's to apply statistical methods to clinical trials.
The epochal streptomycin trial conducted under the auspices
of the Medical Research Council, first reported in 1948 by the MRC
and later by Bradford-Hill (1962), may also be considered as one of
the contributors to present trends. What is important about this
study is its impact on medicine through the introduction of scientific
attitudes to the study of treatments. Further development of
methodology in clinical trials diverged from the analytical
methods derived from reliability and life testing, with a shift of
emphasis towards the proper scientific practice of considering a good
design as a primary aim.

Within the medical literature Peto et al (1977) proposed
a major set of guidelines for the conduct of trials. Most of the
emphasis in their report is on the construction of a well-designed
trial. For the analysis of data, however, they adopt a standard
statistical method for use in a clinical trial.

Some of the works of J. Cornfield were responsible for the
early application of statistical analysis methods in clinical trials.
He also pointed out some of the problems of statistical interpretation,
for example, in the area of multiple risks. Although the framework
of risks associated with components of a system failure is simple
enough for mechanical applications, in medicine there are greater
difficulties. Some of the developments of the thesis will be related
to these difficulties.

Finally, it seems that a change of emphasis has taken place.
In early studies of the development of risk of a disease, most
applications were on communicable diseases. An epidemic develops
and initially there is a high risk of failure (death) from
contracting the disease. With the passage of time the chance of
progression of the disease decreases and falls to zero. That is,
for survivors there is often, within a relatively short period of time,
a possibility of return to normality. The present context for chronic
diseases must assume that failure begins from the start of the process,
and so with any secondary event the chances of death increase.

1.2 Some Methodological Concepts in Clinical Trials.

In this section we present some of the special features
of clinical trials. Basically the aim of a clinical trial is the
management of the unknown in a clinical setting, so that some knowledge
or dogma that has been obtained due to historical reasons may be
refuted or substantiated. The information gained is then useful in
practice in the administration of treatments. In this respect a trial
does not differ from an experiment in the natural sciences. However,
any form of scientific enquiry which involves the collection of data
within the human environment is open to various constraints. Some are
related to the impact of the study on the subject under study and some
are related to the actual validity of conclusions drawn from the study.
Although none of the above problems undermines the fact that the final
scientific answer is important, they do affect the
quality of the data which is gathered and the role data gathering
plays in the administrative and ethical areas. From a medical point
of view the question is not only the legitimacy of the approach in terms
of how scientific the trial is, but also whether the trial can be
administratively and ethically accepted. Difficulties in the management
of the unknown are present in many areas. In other forms of trials that
may take place outside of medical fields the experimental unit may be
subject to far greater risks. In fact the introduction of any new
policy can be thought of as posing initial high risks. Within the
framework of medicine, the problem of risk is due to the rights of the
individuals and how the uncertain effects and their conclusion may
benefit society through the works of institutions.

Here a distinction may be made between two types of
risks involved. One form of risk is due to possible progression
of disease or expected status of disease over time if there is no
intervention. The other risk is related to the new method of control
of progression of the disease with expected side effects over time.
Depending on the phase of testing of a new drug, clearly a different
level of risk may be present in treatment.

Three stages have been recognised in the development of
a new drug. We mention these three phases here, but the area
of particular interest for our study is mainly related to one phase
only and deals with controlled trials.

The initial study of a new drug is often referred to as a phase
one trial. There is little emphasis on actual statistical testing
but more on obtaining insight into acceptable dosage and practical
limits in administration.



The next stage is a screening study to assess the effectiveness of the
drug under study and its value in performing further controlled
studies. Finally a phase three trial is the stage where a comparison
of two or more treatment regimes is needed.






The phase two trials have at times been the subject of
controversy as to their place between phase one and phase three.
Often a balance is made between the level of advance of the disease,
and the risk to which it subjects the patient, and the accepted value
of the treatment.

The first and foremost motivation in proceeding with a
trial is to find scientifically valid answers with the minimum number
of patients in the shortest period of time. A well designed trial
has been encouraged from various approaches by many authors.
Peto et al (1977) consider the roles of factorial designs in trials.
Simons (1979) considers the role of stratification in the design stages
of a trial and Brown (1980) discusses the role of cross over trials,
although these are not relevant to survival studies. We have mentioned
these methods for completeness and consider some of them during the
analysis of trials in later chapters. At the centre of these
approaches lies the principle of randomisation of the patients to
the various arms of a trial. Randomisation is seen from a scientific
view to hold a central role. It has also been increasingly accepted
by the medical profession as having an important place in all
assessments of comparative patient management.

An alternative to controlled clinical trials is the use
of historical controls, which has found favour in certain clinical
circles. The latter approach does not resolve the important problems
of personal bias of the investigator and of inherited institutional
dogma. At the present time the value of randomised controlled
clinical trials is recognised by most medical investigators, although
their proper practice in data collection and interpretation has
been the subject of discussion in different situations. Ethics
and the value of scientific refutability form the framework of discussion
in this circumstance.

In the past, two general types of historical controls have
been reported. One type compares patient groups
treated by different methods at different times within the same
institution, and a second type allows the comparisons to take
place across various institutions. Neither of these two methods
provides a satisfactory basis for allowing a like with like comparison
of two groups that have been treated by different methods without
making unjustifiable assumptions. Clearly the problem of final
interpretation is that it becomes difficult to distinguish effects
due to treatment from those of institutional and/or time variability.
For example, Pocock (1974) has reported the unreliability of
historical control results from three cancer chemotherapy co-operative
groups. In this study a comparison is made between similar treatments
which are used consecutively. Nineteen such instances were identified, with
the changes in the death rate ranging from -46% to +24% and with 4
instances giving a significant difference at the 2% level. The
phase two trials that we mentioned may at times be defined to belong
to this class of historical controls.

If a treatment is found to produce a major significant
improvement in cure, the weight of such evidence may be so overwhelming
that a controlled trial is not necessary and thus confounding of
treatment effect and time effect is judged unimportant. It must be
emphasized, however, that what may seem overwhelming evidence for
ignoring time confounding to some is not necessarily overwhelming
evidence to others at all times.

Problems of historical controls are not confined
to their philosophical position. In practical terms there are
some further difficulties. Missing information is usually a
problem in statistical analysis and the time gap between treatment
methods does not provide a uniform setting for the recording of
relevant information. Prognostic indicators are often subject
to various forms of interpretation, and again variability across
institutions combined with time variability can introduce additional
bias. In terms of analysis, historical control data
require relatively more control of various factors. These effects
will make the analysis firstly more complex and secondly more
dependent on model assumptions and open to differing interpretations.
The above were some of the problems of historical controls given
that the patient environment does not introduce its own bias.
Eligibility criteria, wrong patient mix, adjuvant patient care and the
observer's perception of the patient's final status are all factors
that open the way for introducing bias from medical participants
in a trial.


Although we have put randomisation as the central
argument of a scientific approach to trials, there are a few other
issues involved in a good statistical design. For reasons of
efficiency and representativeness one can use multi-centre trials
with reasonable levels of stratification. Further, depending on
the form of questions, one may proceed with a cross over or factorial
design trial.

For a good scientific conclusion, there is a need to
organise a trial with a sufficiently large number of patients. In
order for a trial to be able to detect differences of clinical
importance between the treatments and be likely to judge this
difference as statistically significant, either the period of accrual
of patients has to be long enough to allow a large number of
homogeneous patients to be allocated to various treatments, or
alternatively a multi-centre approach could be adopted by which
a number of institutions such as hospitals and medical centres
refer the decision making to a central trials office. The last
approach at times can lead to the introduction of more heterogeneity
in the total population, due to differences in environment, in
institutional practice or in the population structure of the different
areas. Here a distinction must be made between institutional
variability that is controlled by the randomisation and that of
historical controls. In controlled trials, although extra variability
is introduced by the institution, the randomisation within institutional
strata ensures that no bias is involved in the final
assessment. The long accrual period also has a slight similarity
with historical controls in that it spans through time. However,
the distributional variability of the patients prior to treatment
allocation can be thought of as being more consecutive in controlled
trials.
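
Randomisation within institutional strata, as described above, can be sketched as follows. This is an illustrative Python sketch only and is not the allocation scheme of any trial discussed in this thesis. It assumes a permuted block design within each stratum, and the function name, arm labels, block size and centre names are all hypothetical.

```python
import random


def stratified_block_randomisation(patients, arms=("A", "B"), block_size=4, seed=0):
    """Allocate patients to arms using permuted blocks within each stratum.

    patients: list of (patient_id, stratum) pairs in order of entry.
    Returns a dict mapping patient_id to the allocated arm.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    pending = {}      # stratum -> allocations left in that stratum's current block
    allocation = {}
    for patient_id, stratum in patients:
        block = pending.setdefault(stratum, [])
        if not block:
            # Start a new balanced block for this stratum and shuffle it.
            block.extend(list(arms) * (block_size // len(arms)))
            rng.shuffle(block)
        allocation[patient_id] = block.pop()
    return allocation


# Hypothetical example: eight patients entering alternately from two centres.
patients = [(i, "centre_1" if i % 2 == 0 else "centre_2") for i in range(8)]
print(stratified_block_randomisation(patients, seed=1))
```

Because each completed block contains every arm equally often, the treatment groups are balanced within each institution, so that differences between institutions add variability but do not favour one treatment.
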

Once a large number of patients are allocated and randomised
to different treatments, the patients are followed up for a
long period. The patients are monitored continuously for the development
of patterns of progression of the disease, with respect to
survival, side effects, disease spread, together with treatment,
stratifying and concomitant variables. Further, it is necessary to
perform the analysis of the data at various times with up-dated
follow-up information, mainly for ethical and administrative purposes.
It is likely that at the time of analysis some patients may not have
responded for each particular time measurement. This effect is known
as censoring of the survival time for the patients, in that no response
is known and survival time is cut off by other events before the
patient has had a sufficiently long period of follow-up for responding.
Censoring is a special effect present in the study of failure time data.
A few special problems arise in the presence of censoring. The major one
is related to "lost to follow-up" cases. It is possible that in
certain trials one group of patients produces a different distribution
as regards the number of patients that are lost at the time of analysis.
Such effects are mainly due to administration of the trials and are
undesirable. In the next section we will deal with censoring in more
detail.
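
The way in which censoring arises at an interim analysis can be made concrete with a small sketch. The Python below is illustrative only. It assumes that all times are recorded on a common calendar scale, and the function name, argument names and patient values are hypothetical.

```python
def observed_survival(entry_time, failure_time, analysis_time, lost_time=None):
    """Return (observed_time, event) for one patient at the time of analysis.

    All arguments are calendar times in the same units (for example months
    since the trial opened). failure_time is None if no failure is recorded;
    lost_time is the last contact time for a patient lost to follow-up.
    event is 1 for an observed failure and 0 for a censored survival time.
    """
    # Follow-up ends at the analysis date, or earlier if the patient was lost.
    cutoff = analysis_time if lost_time is None else min(lost_time, analysis_time)
    if cutoff < entry_time:
        raise ValueError("follow-up cannot end before the patient entered")
    if failure_time is not None and failure_time <= cutoff:
        return failure_time - entry_time, 1   # failure observed
    return cutoff - entry_time, 0             # survival time censored


# Hypothetical patients: (entry, failure, lost to follow-up).
patients = [(0, 30, None), (10, None, None), (10, 80, None), (0, None, 24)]
for entry, failure, lost in patients:
    print(observed_survival(entry, failure, analysis_time=60, lost_time=lost))
```

The third patient shows why the analysis has to be repeated with up-dated follow-up: the failure at time 80 is not yet visible at an analysis carried out at time 60, so the patient contributes a censored time of 50.
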



Randomisation can provide a good setting for control of
administrative bias. However, it provides no guarantee that differences
between the groups towards the end of study are only due to
treatment effects. It is important that together with the formulation
of an a priori hypothesis, a framework is set up so that the
patients in the two groups are in some sense comparable in terms of
their known prognostic indicators and follow-up procedures. This
framework in practice is extended to a protocol that all participants
agree to conform to. In this way the data collection and interpretation
of effects and some of the clinical practices are standardised.



From a scientific point of view the emphasis on the better
design of a trial will clearly enhance the reliability of a conclusion
that is drawn from a trial. Much of the respectability of hard data
sciences such as physics and chemistry is attributed to the development
of good calibration and of instruments for proper
measurement. The development of better recording facilities and
computer storage and analysis may go some way towards providing more
uniform standards in clinical assessment.

Some of the prognostic indicators later form the basis of 
further analysis of survival times. At times such analysis can 
suggest a path for formulation of a new hypothesis. Here there
is a need to distinguish between two forms of questions that may 
arise. All the above discussions have dealt with the value of a 
treatment in terms of the individual survival times. However, other 
failure time indicators related to progress of disease, side effects 
and changing prognostic indicators at times can be used to provide 
information on the biological nature of the treatment and disease. 

This latter distinction between the two types of question
is made in recognition of the fact that trials are not experiments
in the pure hard data sense of the word. What may be termed
in the 'hard' sciences data dredging and problems of multiplicity
may justifiably be recognised as locally valued exploratory data
analysis in the clinical trial data context. The problem is that
what is often considered as valuable research is related to the
unknown, and it is in the area of the unknown where clinical judgement
may be thought to be at its strongest value. This type of exploratory
analysis therefore can provide a framework for reduction of
the data and secondary analysis. Part of the benefit of local
exploratory analysis will be in the formulation of new hypotheses and
part may be in terms of an improvement in the quality
of the data that is collected. However, it must be emphasised that
a proper placing of secondary (exploratory) analysis is achievable
only by the utilisation of diverse and relevant methods of analysis.

1.3 Trends, Philosophy and Ethics.

In the previous section an overview of the main topics
of clinical trials was given. In this section
some trends and developments in the light of the present definitions
will be given. Clinical trials play an important role both in
terms of the value of the information they produce and in their impact
on the general public. They introduce problems of ethics in a
situation where there are conflicting interests and risks involved.
Further, to resolve the real problems that exist and to arbitrate
between conflicting risks and advantages we use scientific methodology.
This is at a time when the distinction between science in
its pure sense of the word and its applications is diminishing.

In the previous pages, we discussed the setting within which 
a trial is performed and we touched on a few topics that determine the 
design stages of a trial. We will now continue with the quality and 
form of the data that arises and the type of information that is 
considered to be essential for providing an answer to the questions 
on trials. 
The minimum data required for the analysis of a trial is
the information on treatment allocation to the individual patients
and the survival distributions at the end of the study. A slightly
more elaborate analysis may also require auxiliary covariate
information on the prognostic indicators. In the course of this
thesis we will mention some of the established methods for an
extensive analysis and concentrate on the proportional hazards method
of Cox (1972), read at the Royal Statistical Society.

The proportional hazards model and some of its recent
extensions constitute a major development in the methods of analysis.
The model allows a comparison of the history of the disease by use
of prognostic indicators that may change through time. As a statistical
method of analysis, the approach can allow an expansion of the
methodology of analysis of event time variability.
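
For reference, the model of Cox (1972) is usually written in the form below. The notation is an assumption made for this illustration and may differ from that adopted in later chapters: \lambda_0(t) denotes an unspecified baseline hazard, z(t) the vector of prognostic indicators, which may depend on time, and \beta the vector of regression coefficients.

```latex
\lambda(t \mid z) = \lambda_0(t)\, \exp\{\beta^{\top} z(t)\},
\qquad
\frac{\lambda(t \mid z_1)}{\lambda(t \mid z_2)} = \exp\{\beta^{\top}(z_1 - z_2)\}
\quad \text{for fixed covariates } z_1,\ z_2 .
```

With fixed covariates the ratio of the hazards for two patients does not depend on t, which is the proportionality that gives the model its name.
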

Here we will mention a few recent approaches that have
been attempted in various fields. Later in the course of the thesis
we will concentrate on cancer trial data only.

1. Di Prete 1981, considers a study of duration of employment
in which adult members of a labour force pass from various 
states of unemployment to employment. 

2. Hannan, Tuma and Greenveld 1978, consider effects of income and 
other effects on the periods of marriage and divorce. 

3. Hannan and Carroll 1981, study of effects of various character- 
istics in society that lead to various forms of government and 
the times of remaining in one political status. 




4. Crowley and Hu 1977, study heart transplant data and various
characteristic variables in determining survival times.

5. P.K. Anderson and N.K. Rasmusson (1982) consider times of 
admission of a group of women attending psychiatric hospitals. 

Although the above studies arise in different settings,
all deal with the progression or development of a process through
time. This parallels the progress of disease in time and the possible
events that may occur in this process. The emphasis here is
not so much on the desirability of the approach in a clinical
setting as on its practicality in providing a
flexible model for the interpretation of the data.

The need for organised experimentation arose in the
natural sciences due to a need to replace occasional fragmentary
experience with harder unbiased evidence. In such contexts the
experimental unit is an inanimate object with no morals, collective
memory or values. The need to perform experiments on human
subjects in general arises out of a wish to answer important
questions on the nature and treatment of various diseases with
some degree of scientific and ethical accountability. The final
result is scientific and technical progress for the benefit of
society. In the biomedical fields in particular the institutional
demands and individual rights play a major part in the final
outcome of the study. In general two types of experiments are
identified in this context, therapeutic and non-therapeutic. We
will now give a brief description of the two.




Non-therapeutic experiments are primarily performed for 
the purpose of gaining new knowledge and not so much for reasons of 
benefit to the subject. An example is the use of healthy human 
volunteers in early phases of drug testing. 

More important are the therapeutic experiments. The
primary aim is to benefit the patients by intervening in the progress
of a disease. However, the intervention is simultaneously organised
in a controlled manner so that a valid scientific conclusion may
be possible at the end of the study. On the scientific importance
of such trials, M. Baum, R. Kay and H. Scheurlen (1982) have written:

"Over-enthusiastic and uncritical adoption of a conceptual framework
by some clinicians has led to therapeutic dogma and consequent
erection of new ethical constraints. Factors outside the control
of the clinicians which are active in hindering progress are an
increasing public awareness of the problem, the clamour for informed
consent, scrutiny by the legal profession, the involvement of national
government agencies and the escalating costs of treatment. Those
developments also force us to reconsider the scientific fundamentals
of clinical trials as opposed to other approaches to scientific
questioning".

The key words in statistics are information and evidence,
and the subject has always dealt with three practical problems. What are the
assumptions of analysis? What are the assumptions of collection?
And finally, how relevant is the data? The above problems are
particularly relevant in trials in that results may not be known for
a long period. As far as the attitudes of the clinicians involved
in treatment and measurement are concerned, changes may take place.
This may result in premature withdrawal from a trial with the result
that the objectives of the trial are not fulfilled. Alternatively,
their assessment of patients may change over the period of a trial.
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This final remark will be emphasized to some extent in 
the course of the thesis on the effect of varying definitions such 
as progression of the disease that may arise. These changes of 
concept may affect the clinicians from many directions, from those of 
personal motivation to be right to those of individual responsibility. 
The final effect is that there is potential for conflict between the 
scientific objectives of the trial and the subjective decisions of 
the clinicians. In here science is dependent to some extent on the 

background assumptions. In the physical sciences, performing standard 
uniform methods of measurement is possible, but in a clinical 
setting even with a willingness to conform systematically with 
the protocol, the measurement will not necessarily be free from 
preconceptions. One further difficulty mentioned in the last 
section is human involvement as an experimental subject and the fact 
that individual rights are at the forefront of any responsibility. 
There are different modes of ethics present. First of all cancer is 
a problem and it is ethical for our institution to find the relevant 
answers. Also it is ethical to utilize resources efficiently and 

be aware of their value and obtain relevant inference. Further, 
there are clinical ethics based on the personal judgement of the 
physician and finally there are interests of the individual and a 
choice preference he or she may want to excercise. 

Here a difference exists between the observational
requirements of the natural sciences and the ethical attitudes of the
individual physician, mainly due to the limited form of information
available to them at a time. For example during the progress of a
trial a physician may gain the impression from incomplete data that
one treatment is more successful than another, posing him an
ethical dilemma as to whether to continue with the trial or withdraw.
It is difficult to consider any of the above mentioned problems
in isolation from the role of computer and information networks in
the development of future procedures. Science as a common
arbitrator is confronted with many information techniques, ranging
from multivariate statistical methods to data base management
systems. A general and undisciplined use of these methods
would lead to an increased likelihood of ethical conflict. On the
other hand the use of relevant methods of secondary analysis,
in the correct context and specified fully by a protocol at the
beginning of the study, may contribute towards better participation.
With respect to the role of feedback of information, Prescott (1978),
based on patient entry into Edinburgh trials, indicates that with
a feedback of information it may be possible to maintain the level
of interest in a multicentre trial.

1.4 Definitions and Mathematical Functions.

In this section we will develop and define some of the
initial concepts in survival or failure time analysis. Before we
commence with the various definitions needed in this thesis it
must be emphasised that the titles survival or failure time analysis
are a little misleading: basically we are interested in an
analysis of the progress of various events in time, and the event
need not be death or regression but can be discharge from hospital,
or any other event not necessarily representing a failure.

In the study of survival time three mathematical functions
are often used. These are the survival function, the hazard rate and
the density function. These functions are in fact different
transformations of one another. For reasons of interpretation,
however, a particular function is usually used, and in the course of
the thesis we will mention certain practical advantages of each.

For each of our cases we have a time $t_i$ available, which
is the observed period for that case until a particular event of
interest, for example death. Clearly $t_i$ is always greater than zero.
We denote the density function of $T$ by $f(t)$ and the distribution
function by $F(t)$, as is the usual practice in the
statistical literature. We can thus define a more useful function
for these applications, namely the survival function $S(t)$, giving

$$S(t) = 1 - F(t) = \Pr\{T > t\} = \Pr(\text{survival for a case exceeds } t).$$

Also

$$f(t) = -\frac{dS(t)}{dt} \qquad \left(\text{and as usual } \int_0^\infty f(t)\,dt = 1\right).$$



Another useful function is the hazard rate or hazard function.
In epidemiology this is known as the force of mortality. The
hazard function is given by

$$\lambda(t) = \frac{\Pr\{t < T < t + dt \mid T > t\}}{dt}
= \frac{1}{dt}\Pr\{\text{death in a small interval } dt \text{ given survival up until time } t\}.$$

We can now explore the functions in relation to each other. 



$$\int_0^t \lambda(u)\,du = \int_0^t \frac{f(u)}{S(u)}\,du
= \Big[-\log\{1 - F(u)\}\Big]_0^t = -\log[1 - F(t)] = -\log S(t),$$

which implies the important relationship

$$S(t) = \exp\left\{-\int_0^t \lambda(u)\,du\right\}.$$

These concepts have been defined here for a continuous case but can 
be extended to discrete form of T. 
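
As an illustrative aside, the relationship between the hazard and the survival function can be checked numerically. The sketch below assumes a Weibull hazard with an arbitrarily chosen shape and scale (these values, and the function names, are assumptions for illustration and are not taken from the thesis), integrates the hazard with a simple midpoint rule and compares the result with the closed-form Weibull survival function.

```python
import math

def weibull_hazard(t, shape=1.5, scale=2.0):
    # lambda(t) = (shape/scale) * (t/scale)^(shape-1); parameters are illustrative
    return (shape / scale) * (t / scale) ** (shape - 1)

def weibull_survival(t, shape=1.5, scale=2.0):
    # Closed form S(t) = exp(-(t/scale)^shape) for the same parameters
    return math.exp(-((t / scale) ** shape))

def survival_from_hazard(hazard, t, steps=20000):
    """S(t) = exp(-integral_0^t hazard(u) du), using the midpoint rule."""
    h = t / steps
    cumulative = sum(hazard((k + 0.5) * h) for k in range(steps)) * h
    return math.exp(-cumulative)

t = 3.0
approx = survival_from_hazard(weibull_hazard, t)
exact = weibull_survival(t)
print(approx, exact)
```

The two printed values should agree to several decimal places, which is all the sketch is intended to show.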

In practice what distinguishes survival analysis from
most other branches of statistical analysis is that at the end of
the study, or at the time of analysis, we do not have a failure time
for some of the cases. That is, we know that they have survived up
until the last follow-up and also know that they will fail at some
time in the future. This effect is known as censoring of the failure
times and will be discussed for the rest of this section.



For each case we will have a time $y_i$ or $c_i$ available,
indicating respectively that the observed time was terminated by a
failure or that the case has not had enough follow-up time to produce
a failure. In industrial applications two types of censoring, namely
type I and type II, are usually used. Both of these types of
censoring imply that all cases are put on trial simultaneously at
time zero. If a fixed maximum time to failure is considered
sufficient before the end of a trial we have type I censoring, and
if the stopping criterion is taken to be the ratio of censored cases
to sample size we have type II censoring. An example is the situation
of monitoring a set of light bulbs over time. We will not develop
these concepts any further but continue with a form of censoring
that will be used later in the thesis.

In biomedical applications a different type of censoring
is produced by the data, usually named random censoring.
Patients are entered into a trial at different times and are then
observed after treatment for a number of years. We therefore have
a time $t_i$ for case $i$, given by

$$t_i = \min(y_i, c_i),$$

that is, we observe either censoring or failure, whichever is first,
and

$$\delta_i = \begin{cases} 0 & \text{if } y_i > c_i \quad (t_i \text{ is censored}) \\ 1 & \text{if } y_i \le c_i \quad (t_i \text{ is not censored}). \end{cases}$$

This notation will be used in the development of various models.

In practice some further complications arise. What we
have defined to be death or censoring can in fact be a subset of
the final outcomes of a more complex process with more end points.

For example at times a patient decides to leave the
geographic area within which a trial is conducted and thus the case
is lost to follow-up. At times a termination other than the one
of interest occurs, say a death from a second illness or a car
accident, and thus the final result can be open to different
interpretations.
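
The random censoring notation above can be sketched in a few lines of code. The function name `observe` and the example failure and censoring times are hypothetical, chosen only to illustrate how the observed time and its censoring indicator are formed from a potential failure time and a potential censoring time.

```python
def observe(failure_times, censoring_times):
    """Return (t_i, delta_i) pairs with t_i = min(y_i, c_i) and
    delta_i = 1 if y_i <= c_i (failure observed), 0 otherwise (censored)."""
    records = []
    for y, c in zip(failure_times, censoring_times):
        t = min(y, c)
        delta = 1 if y <= c else 0
        records.append((t, delta))
    return records

# Hypothetical data: potential failure times y and censoring times c (years)
y = [5.0, 8.2, 3.1, 9.0]
c = [6.0, 4.0, 7.5, 9.0]
observed = observe(y, c)
print(observed)
```

In real data only the pair of observed time and indicator is available; the unobserved member of each (failure, censoring) pair is never seen, which is why the independence assumption discussed next matters.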

Generally we assume that censoring times are independent
of death times. This assumption is quite valid for most trials.
However if dropping out of treatment is more common for one arm
of the trial, the effect is at least a loss in efficiency and, more
seriously, a possible introduction of bias. In the thesis we will
also discuss the possibility of analysing data with more than one
type of failure, and in these contexts certain types of dependence
between death times and censoring can be tested.

1.5 Outline of Thesis.

Ethics and certain scientific stands give trials properties 
that are slightly different from scientific experimentation in 
the natural sciences. The role of large, cheap and accessible 
computer information banks and fast end processing is new to this 
area and is changing the statistical methodology which can be 
applied. The recent developments in the field of failure time 
analysis originating mainly from Cox (1972) and his proportional 
hazard models are the main subject of our discussion. 

We will study the applications of this model to clinical
trial data with various forms of interrelations. The variety of
interrelations will be defined to be both in terms of covariate
effects and actual events, where more than one event is present on
the time scale. Further we will study the flexibility of the
method in dealing with the different forms of interrelationship
that arise in situations where the regression parameters are
not necessarily fixed and their influence can best be described
in terms of a process through time rather than a cause and effect
situation. The major emphasis of our discussion will be on the
exploratory use of the analysis and the variety of the models
available in the framework of proportional hazards. At times
when the limitations of the proportional hazard methods make them
inappropriate, for example in the study of the actual distributional
shape of the hazard rates, we will discuss alternative methods.

In the context of the proportional hazard models with
intervening events and time dependent covariates, there is a
deviation from the traditional regression approach. Proportional
hazards do not impose the same restrictive assumptions in the
distinction between the exogenous and endogenous variables (in
this framework fixed covariates and final response times). With
the use of this extra flexibility a greater number of models are
available for analysis under the proportional hazard assumptions.
This flexibility is aimed less at a different interpretation
of cause and effect, and more at the interpretation of the
structure of change.

Although from a scientific point of view it is
desirable that all measurements are made under uniform conditions
throughout the trial, it may not be possible to isolate totally
clinical judgement, which may change with experience, from clinical
measurement. Further the individual behaviour of the patients
and also the long time scale in data collection may play a
role in the different patterns of development occurring for
different subgroups of patients. We will not analyse data
according to all of the above possibilities. However time
dependency does provide a good construct for such an analysis.
We will be looking at certain aspects of time dependency within
the thesis.

In outline the structure of the rest of the thesis is
as follows. In chapter two we will consider the non-parametric
methods and their advantages. Also in this chapter we will study
a general group of tests that have been used in the analysis of
trial data. Chapter three deals with parametric methods and
the various advantages and disadvantages of the exponential,
Weibull and a few less known but more complex distributions.
In chapter four we deal with the semi-parametric proportional
hazard model of Cox (1972). We will consider various regression
forms of the proportional hazard model and consider its position
in relation to different parametric and non-parametric methods.
Chapter five considers a realistic simulation method for the
generation of clinical trial data. These simulations are
carried out for treatments and one covariate parameter in the
presence of proportional hazard assumptions and deviations from
them, using the various approaches of analysis described above.
Chapters six and seven consider and analyse data from a
clinical trial which was organised in Edinburgh. In particular
in chapter seven we will consider multivariate approaches to
survival analysis and the effect of different intervening events
on the patient progress. Chapter eight considers a different
data set for the purpose of study of various prognostic
indicators and the use of time dependency in prognosis.
Finally, in chapter nine we will bring together the findings from
the earlier chapters.



CHAPTER 2 



NON-PARAMETRIC METHODS

2.1 Initial Developments of Life Tables.

In this chapter we introduce the non-parametric
methods of analysis of survival data. These methods are closely
related to the life table method originally proposed by
Halley (1693), which was mentioned in the introduction.
Such life tables, according to their particular applications,
have been referred to as population life tables, clinical life
tables and cohort life tables. We do not intend to discuss
the difference between the applications but to concentrate on
the clinical life tables, because of their relevance to failure
time data. Here however we generalise the area of application
by rephrasing to length of stay in a particular state, for
example time from entry to a hospital to time of death or
operation.

Some of the developments outside the field of clinical
life tables, such as those of competing risks, are relevant to
multiple failure time analysis and we refer to these methods in
chapter 6, as multivariate competing risks. There have also been
developments in which some of the methodology and techniques
originally used in the analysis of failure time data have found
use in life tables outside of clinical studies. An example is
Breslow (1982) on the use of Cox's method for cohort studies. We
will return to this area again later in chapter 4 when we allocate
a chapter to Cox's approach. Although at times we refer to
similar developments in neighbouring fields we concentrate on
applications to clinical trial data methodology. Two other types
of life tables that are used extensively in other applications are
population life tables and cohort tables. The population life
tables require two sources of data. These are (a) census data
on the number of individuals alive in a particular age group and
(b) vital statistics on the number of deaths in a given year for
each age group. Cohort studies on the other hand concentrate
on describing the actual survival experience of a group born at
about the same time.



In clinical life tables we use data from a group
of patients, and the data refer to the time from entry to a
particular state until removal from that state. The nature of
removal from the state, however, often has to be qualified: it can
be a removal due to response, e.g. death or progression of disease,
or it can be a removal due to withdrawal, censoring, death from
other causes etc. Further, we are interested in a comparison of
two or more treatments and thus the analogy with population
studies is not carried further. In population studies one is
often comparing the survival rate of a group with a rate from
census data or vital statistics, much of which is historically
based information. In trials however, we refer to two arms of
a trial. The comparison of interest is based on a
measure of the difference between the rates of failure in the two
treatment arms. There are some trials based on more than two
treatment options but the principle of the analysis is the same.
In terms of structure, however, the method of clinical life tables
as originally proposed is similar to population life tables
in that the data are grouped into intervals and the
probability of survival is estimated for each time interval.
Chiang (1966) produced variance estimates of the probability
of survival at any fixed point in time for describing the two
treatment groups. Later it was discovered by Kuzma (1967) that
these estimates can underestimate the variance of survival
probabilities quite significantly if the censoring percentage
is high.

Better estimates of survival rates can be obtained by
use of parametric methods, given that the relatively restrictive
assumptions of the parametric distribution of interest are not
violated. This conflict of interest between the robustness of an
estimating procedure and its efficiency is part of the
discussions in chapters 3 and 4.

2.2 Product Limit of Survival Times.

In this section we will describe the product limit
estimate, or the Kaplan and Meier estimate, and later show by
the method of Johansen (1978) that the product limit estimators
can be derived as maximum likelihood estimators. The product
limit estimate of survival for n observed response times and
censoring times was initially proposed as a descriptive method
rather than a method of inference. However recently it has become
the most commonly used method of estimation of life tables in the
context of clinical forms of survival data.



The product limit method differs from the methods
of the previous section in that, rather than using fixed time
intervals, it is based on forming a ranked set of survival times in
such a way that, when a death time and a censoring time are equal,
the censoring time is given a rank greater than that of the
death time. The product limit estimators are of
special interest in that they form the basis of a large number of
non-parametric tests and are closely linked to the proportional
hazard model. We will now proceed with a derivation of the
product limit estimators using maximum likelihood estimation.
Throughout we will assume a continuous time scale so that there
are no tied events for each rank. All results however may be
generalised to tied data with slight extensions.
First we order the survival data into a rank order.

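$$t_{(1)} < t_{(2)} < \dots < t_{(n)}.$$

Further, for each $t_{(i)}$ there exists an indicator variable
$\delta_{(i)}$ such that

$$\delta_{(i)} = \begin{cases} 1 & \text{if } t_{(i)} \text{ is a response} \\ 0 & \text{if } t_{(i)} \text{ is a censored observation.} \end{cases}$$

We then define

$$P_i = P[\text{surviving at } t_{(i)} \text{ given survival until } t_{(i)}] = \Pr[T > t_{(i)} \mid T \ge t_{(i)}]$$

giving

$$P_i = \begin{cases} 1 - \dfrac{1}{n_i} & \text{if } \delta_{(i)} = 1 \\ 1 & \text{if } \delta_{(i)} = 0 \end{cases} \qquad \text{where } n_i = n - i + 1,\ i = 1, \dots, n. \qquad (2.2.1)$$

From the definition of chapter 1 we obtain

$$\hat S(t) = \prod_{t_{(i)} \le t} P_i = \prod_{t_{(i)} \le t} \left(1 - \frac{1}{n - i + 1}\right)^{\delta_{(i)}} = \prod_{t_{(i)} \le t} \left(\frac{n - i}{n - i + 1}\right)^{\delta_{(i)}}.$$

Then the corresponding estimate of standard error is

$$\text{s.e.}[\hat S(t)] = \hat S(t) \left\{ \sum_{t_{(i)} \le t} \frac{\delta_{(i)}}{(n - i)(n - i + 1)} \right\}^{1/2}.$$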
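
The product limit estimate and its standard error can be sketched in code as follows. This is a minimal illustration under the no-ties assumption made in the text; the function name `product_limit` and the five observations are hypothetical and are not data from the thesis.

```python
import math

def product_limit(times, deltas):
    """Return a list of (t_(i), S_hat, se) for each ordered observation.

    deltas[i] is 1 for a response (death) and 0 for a censored time.
    Assumes no tied death times; at equal times a death is ranked
    before a censoring, as described in the text.
    """
    n = len(times)
    ordered = sorted(zip(times, deltas), key=lambda td: (td[0], -td[1]))
    s_hat = 1.0
    variance_sum = 0.0  # running sum of delta_(i) / ((n - i)(n - i + 1))
    curve = []
    for i, (t, delta) in enumerate(ordered, start=1):
        at_risk = n - i + 1
        if delta == 1:
            s_hat *= 1.0 - 1.0 / at_risk
            if at_risk > 1:  # the term is undefined when the last case dies
                variance_sum += 1.0 / ((n - i) * at_risk)
        curve.append((t, s_hat, s_hat * math.sqrt(variance_sum)))
    return curve

# Hypothetical data: deaths at times 1, 3 and 4; censoring at times 2 and 5
curve = product_limit([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
for t, s, se in curve:
    print(t, round(s, 4), round(se, 4))
```

The estimate only steps down at death times; censored observations reduce the number at risk for later steps without changing the estimate at their own time.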



We will now express the survival distribution in terms of a
likelihood function based on product limit estimates and show
that (2.2.1) can in fact be considered as the maximum likelihood
estimates of the likelihood function.

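Likelihood L = (terms due to cases dead) (terms due to censored times)

$$L = \prod_{i=1}^{n} \Pr[T = t_{(i)}]^{\delta_{(i)}} \, \Pr[T \ge t_{(i)}]^{1-\delta_{(i)}}
= \prod_{i=1}^{n} \big[\Pr(T = t_{(i)})\big]^{\delta_{(i)}} \Big[\sum_{j \ge i} \Pr(T = t_{(j)})\Big]^{1-\delta_{(i)}}.$$

We let $R_i$ be the probability of an event at $t_{(i)}$, giving

$$L = \prod_{i=1}^{n} R_i^{\delta_{(i)}} \Big[\sum_{j \ge i} R_j\Big]^{1-\delta_{(i)}}.$$

We can thus define the hazard rates conveniently as

$$\lambda_i = R_i \Big[\sum_{j \ge i} R_j\Big]^{-1} \qquad \text{with} \qquad \sum_{j=1}^{n} R_j = 1.$$

We thus have

$$1 - \lambda_i = 1 - \frac{R_i}{\sum_{j \ge i} R_j} = \frac{\sum_{j \ge i+1} R_j}{\sum_{j \ge i} R_j} \qquad (2.2.2)$$

and

$$\sum_{j \ge i} R_j = (1 - \lambda_{i-1}) \sum_{j \ge i-1} R_j = (1 - \lambda_{i-1})(1 - \lambda_{i-2}) \sum_{j \ge i-2} R_j$$

$$= (1 - \lambda_{i-1})(1 - \lambda_{i-2}) \cdots (1 - \lambda_{2}) \sum_{j \ge 2} R_j
= (1 - \lambda_{i-1})(1 - \lambda_{i-2}) \cdots (1 - \lambda_{2})(1 - \lambda_{1}) \sum_{j=1}^{n} R_j$$

$$= \prod_{j=1}^{i-1} (1 - \lambda_j) \qquad \text{since } \sum_{j=1}^{n} R_j = 1. \qquad (2.2.3)$$

Since

$$\lambda_i = \frac{R_i}{\sum_{j \ge i} R_j} = \frac{R_i}{\prod_{j=1}^{i-1} (1 - \lambda_j)}$$

we have

$$R_i = \lambda_i \prod_{j=1}^{i-1} (1 - \lambda_j). \qquad (2.2.4)$$

Expressing the likelihood in terms of the hazard rates,

$$L = \prod_{i=1}^{n} \Big(\lambda_i \prod_{j=1}^{i-1} \{1 - \lambda_j\}\Big)^{\delta_{(i)}} \Big(\prod_{j=1}^{i-1} \{1 - \lambda_j\}\Big)^{1-\delta_{(i)}}
= \prod_{i=1}^{n} \Big[\lambda_i^{\delta_{(i)}} \prod_{j=1}^{i-1} (1 - \lambda_j)\Big]$$

$$= \lambda_1^{\delta_{(1)}} \cdot \lambda_2^{\delta_{(2)}} (1 - \lambda_1) \cdot \lambda_3^{\delta_{(3)}} (1 - \lambda_1)(1 - \lambda_2) \cdots \lambda_n^{\delta_{(n)}} (1 - \lambda_1) \cdots (1 - \lambda_{n-1}).$$

But $\lambda_n = 1$ by definition. Therefore

$$L = \lambda_1^{\delta_{(1)}} \cdots \lambda_n^{\delta_{(n)}} (1 - \lambda_1)^{n-1} (1 - \lambda_2)^{n-2} \cdots (1 - \lambda_{n-1})^{n-(n-1)} (1 - \lambda_n)^{n-n}
= \prod_{i=1}^{n-1} \lambda_i^{\delta_{(i)}} (1 - \lambda_i)^{n-i}.$$

Now we take the logarithm of the likelihood in order to obtain a
maximum with respect to $\lambda_i$:

$$\frac{\partial \log L}{\partial \lambda_i} = \frac{\delta_{(i)}}{\lambda_i} + (n - i)\,\frac{(-1)}{1 - \lambda_i} = 0,$$

$$\delta_{(i)} (1 - \lambda_i) + (n - i)(-1)\lambda_i = 0,$$

$$\lambda_i = \frac{\delta_{(i)}}{\delta_{(i)} + n - i}.$$

But $\delta_{(i)}$ is either 0 or 1, giving

$$\hat\lambda_i = \frac{\delta_{(i)}}{1 + n - i}. \qquad (2.2.5)$$

For the product limit estimator, by definition and (2.2.5), we get

$$\hat R_i = \hat\lambda_i \prod_{j=1}^{i-1} (1 - \hat\lambda_j) = \frac{\delta_{(i)}}{1 + n - i} \prod_{j=1}^{i-1} \Big(1 - \frac{\delta_{(j)}}{n - j + 1}\Big).$$

Thus by the definition of $\hat S(t)$ we have

$$\hat R_i = \frac{\delta_{(i)}}{1 + n - i} \prod_{j=1}^{i-1} \Big(\frac{n - j}{n - j + 1}\Big)^{\delta_{(j)}}. \qquad (2.2.6)$$

Therefore we conclude that the $\hat R_i$ are the required maximum
likelihood estimators of survival times.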
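
The maximum likelihood result (2.2.5) can be checked numerically on a small example. The sketch below is an illustration only: the indicator sequence is hypothetical, and the check simply perturbs each hazard away from its estimated value and confirms that the log likelihood does not increase. It also rebuilds the survival estimate from the estimated hazards.

```python
import math

def log_likelihood(lams, deltas):
    """log L = sum over i = 1..n-1 of
    delta_(i) * log(lambda_i) + (n - i) * log(1 - lambda_i)."""
    n = len(deltas)
    total = 0.0
    for i in range(1, n):
        lam, delta = lams[i - 1], deltas[i - 1]
        if delta == 1:
            total += math.log(lam)
        total += (n - i) * math.log(1.0 - lam)
    return total

# Hypothetical ordered indicators: 1 = response, 0 = censored
deltas = [1, 0, 1, 1, 0]
n = len(deltas)

# Estimated hazards from (2.2.5): delta_(i) / (1 + n - i)
mle = [d / (1 + n - i) for i, d in enumerate(deltas, start=1)]
best = log_likelihood(mle, deltas)

# Perturb each of the first n-1 hazards and confirm the likelihood drops
perturbed_is_worse = True
for k in range(n - 1):
    for eps in (-0.05, 0.05):
        trial = list(mle)
        trial[k] += eps
        if trial[k] < 0.0 or trial[k] >= 1.0:
            continue  # outside the admissible range for a hazard
        if log_likelihood(trial, deltas) >= best:
            perturbed_is_worse = False

# Survival estimate as the running product of (1 - lambda_hat_j)
survival = []
s = 1.0
for lam in mle:
    s *= 1.0 - lam
    survival.append(s)

print(mle, perturbed_is_worse, survival)
```

The running product reproduces the product limit values, since each factor equals the conditional survival probability defined in (2.2.1).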



2.3 Non-parametric methods for two treatments.

First we consider the log rank test, Peto (1972a), which is also
named the Mantel, Mantel-Haenszel, Mantel-Peto-Cox, or Savage-Mantel-
Peto-Cox statistic. In this method, which is based on the observed
and expected values of numbers of events in a particular time (under
the null hypothesis), we derive a form of chi-squared test which is
indirectly related to the ranks of survival times. The ranks are
then transformed to a comparative ratio of numbers responding and
numbers at risk. By this method any probability value that we
obtain is used for the main objective of discrimination between
treatments. This enables us to infer that the difference between
observed and expected values of the survival rates is either compatible
with the null hypothesis of no treatment difference, or that it is
due to the effects of the alternative hypothesis that there are
treatment differences. There are certain assumptions necessary for
an analysis based on the log rank test. Later we will compare these
assumptions with those of the Wilcoxon test and present a general
form of test which incorporates both tests as special cases. At this
stage however we only mention that the method can be derived from and
is related to Cox's proportional hazard model of Chapter 4.

Initially the same procedure as that of Kaplan and Meier
is used to transform the survival times. Similarly a vector of
survival times is obtained based on

$$t_{(1)} < t_{(2)} < \dots < t_{(r)}.$$

Thus at the beginning of each of these time points, say $t_{(j)}$, we
form a 2 x 2 table to categorise the total number of patients at
risk, according to treatment grouping and status at the end of the
$t_{(j)}$ period.

             Number in group      Observed events        Alives
             at time $t_{(j)}$    (deaths) in group
                                  at $t_{(j)}$

  Group 1    $N_{1j}$             $O_{1j}$               $N_{1j} - O_{1j}$
  Group 2    $N_{2j}$             $O_{2j}$               $N_{2j} - O_{2j}$
  Total      $N_j$                $O_j$                  $N_j - O_j$

Then we have a contingency table giving $E_{ij} = N_{ij} O_j / N_j$ as the
expected number of responses at $t_{(j)}$ in group $i$.

We thus represent the above as a 2 x 2 x r table with the log rank
statistic

$$LR = \frac{\Big[\sum_{j=1}^{r} (O_{1j} - E_{1j})\Big]^2}{\sum_{j=1}^{r} V_j} = \frac{(O - E)^2}{V}. \qquad (2.3.1)$$

Using the hypergeometric distribution and the corresponding
moment generating functions we have the first two moments, giving

$$E_{ij} = \frac{N_{ij} O_j}{N_j}, \qquad V_j = \frac{N_{1j} N_{2j} O_j (N_j - O_j)}{N_j^2 (N_j - 1)},$$

which can be used in (2.3.1).

For a single level $\chi^2$ test where $r = 1$ we can then present a
$\chi^2$ for a single level of $j$, giving

$$\chi^2 = \frac{(N_j - 1)\big[O_{1j}(N_{2j} - O_{2j}) - O_{2j}(N_{1j} - O_{1j})\big]^2}{N_{1j} N_{2j} O_j (N_j - O_j)}.$$

Now by referring to the table of the chi-squared distribution with
one degree of freedom we can accept or reject the null hypothesis of
equality of survival rates for the two treatment groups against the
alternative of different survival rates.
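
For a single 2 x 2 table, the statistic formed from the observed value, the hypergeometric mean and the hypergeometric variance should agree with the direct cross-product form. The sketch below checks this on one hypothetical table; the function names and counts are assumptions for illustration, and the cross-product form is written with the factor (N - 1) so that it matches the hypergeometric variance used in the text.

```python
def chi_square_from_moments(n1, o1, n2, o2):
    """(O_1 - E_1)^2 / V using the hypergeometric mean and variance."""
    n, o = n1 + n2, o1 + o2
    expected = n1 * o / n
    variance = n1 * n2 * o * (n - o) / (n ** 2 * (n - 1))
    return (o1 - expected) ** 2 / variance

def chi_square_direct(n1, o1, n2, o2):
    """Cross-product form of the same single-table statistic."""
    n, o = n1 + n2, o1 + o2
    cross = o1 * (n2 - o2) - o2 * (n1 - o1)
    return (n - 1) * cross ** 2 / (n1 * n2 * o * (n - o))

# Hypothetical table: 20 at risk with 5 deaths in group 1,
# 25 at risk with 2 deaths in group 2
from_moments = chi_square_from_moments(20, 5, 25, 2)
direct = chi_square_direct(20, 5, 25, 2)
print(from_moments, direct)
```

Both forms give the same value because the difference between observed and expected deaths in group 1 equals the cross-product divided by the total number at risk.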



The log rank test is based on the Kaplan and Meier estimates.
It acts indiscriminately in combining the expected numbers of
failures. As with the Kaplan and Meier estimates, the expected value
of the number of events in each category is obtained from the ratio
of the number of events to the number at risk. However in some
circumstances a more efficient estimation of the survival
differences may be possible if a weighting is attached to the
expected number of events. We will consider those conditions later
in this section when we deal with Gehan's generalisation of the
Wilcoxon test.

The special property of the Wilcoxon test is that at each
event time the contribution to the likelihood is weighted by the
total number at risk at $t_{(j)}$. This statement is analogous to a
special form of time dependency in proportional hazards. In terms of
interpretation however the null hypothesis is slightly different
between the two tests, in that for the Wilcoxon test the null
hypothesis is based on the equality of the survival rates between
the two groups together with equality of the censoring rates.
In the log rank test this latter assumption is not required.

Thus in the Wilcoxon test early events are weighted
slightly higher than late events. The log rank test may be
expressed in vector form by

$$\chi^2 = (O - E)' [V]^{-1} (O - E).$$

The notations for $O$, $E$ and $V$ are expressed in matrix form
below for a similar expression of the Wilcoxon test; equivalently
the Wilcoxon test can be expressed as

$$[l'(O - E)]' \, [l' V l]^{-1} \, [l'(O - E)]$$

where for a comparison of two treatments we let

$$l' = (l_1, \dots, l_r), \qquad O' = (O_{11}, \dots, O_{1r}), \qquad E' = (E_{11}, \dots, E_{1r})$$

and

$$V = \mathrm{diag}(V_1, \dots, V_r).$$

Further $l_j$ is set to $N_j$, the number at risk at $t_{(j)}$.

The above formulation was first used by Tarone (1975) in a
test for departure from trends. Tarone and Ware (1977) show that
the difference between the logrank test and the Wilcoxon test is
in fact due to the choice of weights as a function of the number
of individuals at risk at the time of each death. Once again with
one degree of freedom we have a chi-square,

$$\chi^2_{TW} = \frac{\Big[\sum_{j=1}^{r} w_j (O_{1j} - E_{1j})\Big]^2}{\sum_{j=1}^{r} w_j^2 V_j} \qquad (2.3.2)$$

where

$$V_j = \frac{N_{1j} N_{2j} O_j (N_j - O_j)}{N_j^2 (N_j - 1)} \qquad \text{and} \qquad E_{1j} = \frac{N_{1j} O_j}{N_j}.$$

Thus this general result gives the logrank test for $w_j = 1$ and
the Wilcoxon test for $w_j = N_j$. Tarone and Ware suggest a
different function of weights, namely $w_j = \sqrt{N_j}$, and claim that
it has better efficiency over a range of alternatives.
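
The weighted family (2.3.2) can be sketched as a single function with a choice of weights. This is an illustrative implementation under stated assumptions: the function name `weighted_logrank`, the weight labels and the two small samples are hypothetical, and time points with fewer than two cases at risk are skipped because the variance term is undefined there.

```python
import math

def weighted_logrank(times1, deltas1, times2, deltas2, weight="logrank"):
    """Chi-square (1 d.f.) of the form [sum w_j (O_1j - E_1j)]^2 / sum w_j^2 V_j.

    weight: "logrank" (w_j = 1), "wilcoxon" (w_j = N_j)
            or "tarone-ware" (w_j = sqrt(N_j)).
    """
    data = [(t, d, 1) for t, d in zip(times1, deltas1)]
    data += [(t, d, 2) for t, d in zip(times2, deltas2)]
    death_times = sorted({t for t, d, _ in data if d == 1})
    numerator, denominator = 0.0, 0.0
    for tj in death_times:
        n1 = sum(1 for t, _, g in data if g == 1 and t >= tj)
        n2 = sum(1 for t, _, g in data if g == 2 and t >= tj)
        o1 = sum(1 for t, d, g in data if g == 1 and d == 1 and t == tj)
        o2 = sum(1 for t, d, g in data if g == 2 and d == 1 and t == tj)
        n, o = n1 + n2, o1 + o2
        if n < 2:
            continue  # variance term undefined with a single case at risk
        expected1 = n1 * o / n
        variance = n1 * n2 * o * (n - o) / (n ** 2 * (n - 1))
        w = {"logrank": 1.0, "wilcoxon": float(n),
             "tarone-ware": math.sqrt(n)}[weight]
        numerator += w * (o1 - expected1)
        denominator += w ** 2 * variance
    return numerator ** 2 / denominator

# Hypothetical survival times (1 = death, 0 = censored) for two arms
arm1_t, arm1_d = [6, 7, 10, 15, 19, 25], [1, 1, 0, 1, 0, 1]
arm2_t, arm2_d = [1, 2, 3, 4, 8, 11], [1, 1, 1, 0, 1, 1]
results = {w: weighted_logrank(arm1_t, arm1_d, arm2_t, arm2_d, w)
           for w in ("logrank", "wilcoxon", "tarone-ware")}
print(results)
```

Each value would be referred to the chi-squared distribution with one degree of freedom; the three weightings differ only in how much emphasis they place on early time points, where more cases are at risk.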



The above approach is closely related to time dependency
scaling of the proportional hazards. The Wilcoxon test considers
the distribution of censoring times as well as death times. There
is however no reason why $w_j$ must be defined as a function of $N_j$
alone. In fact later, in the chapters on proportional hazards, we will
consider time dependencies as a function of metastatic or other
intervening events.

Now we expand the logrank and Wilcoxon tests to the
multivariate situation. For this purpose the general Tarone
and Ware statistic is used. We will continue with the
$O_{ij}$, $E_{ij}$ and $V_{ij}$ presentations.

In cases where there are a number of subgroups we
present, for say a set of different levels of an outcome, the
following formulation.



  Level      $d_0$      $d_1$      ....   $d_k$      Total at $t_{(j)}$

  Event      $O_{0j}$   $O_{1j}$   ....   $O_{kj}$   $O_{+j}$
  At risk    $N_{0j}$   $N_{1j}$   ....   $N_{kj}$   $N_{+j}$



We then have a logrank type null hypothesis for the equality
of the survival rates across all levels up to $d_k$, and a Wilcoxon
type null hypothesis for equality of all survival rates and also all
censoring rates.

Once again we have a chi-squared test with (k - 1)
degrees of freedom,

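$$\chi^2 = (O - E)' V^{-1} (O - E) \qquad (2.3.3)$$

where the elements of $(O - E)$ are

$$(O - E)_i = \sum_j w_j (O_{ij} - E_{ij}).$$

Using the moments of the hypergeometric distribution we get

$$E_{ij} = \frac{N_{ij} O_{+j}}{N_{+j}}$$

and the elements of $V$ are $\sum_j w_j^2 V_{ij}$ on the diagonal and
$\sum_j w_j^2 V_{ilj}$ off the diagonal, where

$$V_{ij} = \frac{O_{+j}(N_{+j} - O_{+j})}{N_{+j} - 1} \cdot \frac{N_{ij}}{N_{+j}} \left(1 - \frac{N_{ij}}{N_{+j}}\right),
\qquad
V_{ilj} = \frac{O_{+j}(N_{+j} - O_{+j})}{N_{+j} - 1} \cdot \left(\frac{-N_{ij} N_{lj}}{N_{+j}^2}\right) \quad (i \ne l).$$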



Now referring to (2.3.1) and (2.3.2) it can be seen that (2.3.3) is
a generalisation of the previous tests. Once again $w_j = 1$ gives
a general form of the logrank test, $w_j = N_{+j}$ gives the Wilcoxon
test and $w_j = \sqrt{N_{+j}}$ gives a Tarone and Ware type statistic.



2.4 Stratification.

In the introduction we mentioned uses of stratification
in conjunction with randomisation, and considered it to be at times
a proper method of conducting a trial. We did not consider
the necessary analytical techniques in the development of the
methods. We will now consider stratification methods in conjunction
with the non-parametric methods of analysis, which can illustrate the
general advantages of stratification.



In many trials, apart from the treatment assessment
information, an array of different types of exploratory data is
also collected on patients, often referred to as prognostic
indicators or covariates. A few examples of such data that we will
be referring to are age, node status, size of tumour etc. This
type of covariate information is a reflection of the underlying
make-up of the group of patients to whom the inference relates.
A proper randomisation in a large sample would imply that the patient
variability between the two groups is similar. In some trials
however purely leaving the allocation of patients to randomisation
may not provide a satisfactory final patient mix.
In practice the type of adjuvant care or therapy can be dependent
on the prognostic conditions of the patient; this can lead to a
situation in which the two arms of the trial are not
comparable. An example is a situation where the amount of
radiotherapy given may be influenced by the size of the tumour, and
thus the size of the tumour may be the main influence on the survival
rates of the two arms of a trial. In other situations where there is
a perfectly standard treatment for all patients, it may be known
in advance that a group of patients with less advanced
disease will generally have better survival regardless of the
type of treatment. Such differences can possibly lead to a
correlated prognostic and treatment effect and furthermore may
bias the inference. The remedy is often a prospective
stratification. The utilisation of stratification has been the
subject of some controversy. Peto et al (1976) consider stratification
often to be an unnecessary and unjustified administrative
inconvenience. The basis of this view is that for large trials the
gain in power of tests is often nominal, since randomisation
guarantees comparable treatment groups. An alternative view, which is
in favour of stratification, considers firstly that small trials are
common in practice and an important part of research, and secondly
that for large trials an interim analysis can be based on small
numbers of patients, which consequently may condition the conclusions
on the type of patient mix.

It should be pointed out that although stratification adds 
a form of control on the randomisation procedure it in no way 
influences the chances of treatment assignment to a treatment arm. 
Apart from stratification at design, the relation between
stratification and analysis is also important, and at this point
we can make a few comments which also apply to the methods that
we will be considering later. Whether the sample size is large
enough to achieve a balance of treatment arms in terms of
prognostic effects, or the trial is stratified with balanced
prognostic effects, it is useful to account for any of the
possible survival differences that are a priori considered
relevant to the trial. There are two major aims for this type of
analysis which may not have been so clear from the discussion of
stratification and design. Firstly, we may aim to study the whole
patient group, and so it may be of some importance to know the
characteristics of prognosis in different strata and to account
for heterogeneity of patient survival rates. However, care is
needed in the interpretation of such assessments, as it is
possible that any such inferences are conditioned on sample size
and/or other prognostic effects. Secondly, it is possible to use
prognostic factors to define the treatment comparisons. We may
thus obtain a test statistic for each stratum of a prognostic
variable, in order to compare treatment effects within each
stratum. In the next section we will present the results for a
set of trial data in which we obtain test statistics by
considering the fraction of the data belonging to a particular
prognostic category. It is then necessary to obtain an overall
observed and an overall expected (extent of exposure) comparison.
The overall test is important in that, even if the treatment
difference within each prognostic group is not significant, the
treatment effects can combine into a significant overall effect
provided they act in the same direction in every stratum. This
overall test will then remove any consequences of a possible
correlation between prognostic variable and treatment effect. An
important situation in which the above consideration matters is
when the tests for the different strata do not in fact point to a
treatment difference in the same direction. Once again we will
study such effects more carefully in the data analysis section of
this chapter and the subsequent chapters.
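
As an illustration only, the pooling of observed and expected deaths
over strata can be sketched as follows. The sketch assumes the
familiar (O - E)^2 / E approximation to the logrank chi-squared, which
need not be the exact computation behind the tables of this chapter,
and the counts in the strata are hypothetical. It shows two strata
that are individually non-significant but combine when the treatment
effects point the same way, and that cancel when they do not.

```python
def peto_chi2(obs_a, exp_a, obs_b, exp_b):
    """Approximate logrank chi-squared: sum of (O - E)^2 / E over two arms."""
    return (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b


def overall_chi2(strata):
    """Add observed and expected deaths over strata, then test the totals."""
    totals = [sum(column) for column in zip(*strata)]
    return peto_chi2(*totals)


# Hypothetical strata: (observed A, expected A, observed B, expected B).
same_direction = [(10, 14.0, 18, 14.0), (20, 25.0, 30, 25.0)]
opposite_direction = [(10, 14.0, 18, 14.0), (30, 25.0, 20, 25.0)]

per_stratum = [peto_chi2(*s) for s in same_direction]
chi2_same = overall_chi2(same_direction)
chi2_opposite = overall_chi2(opposite_direction)

print(per_stratum, chi2_same, chi2_opposite)
```

With these figures each stratum falls below the 5% point of chi-squared
on one degree of freedom (3.84), while the pooled comparison exceeds it
only when both strata favour the same arm.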

2.5 Comparative application of non-parametric tests. 

Consider a trial in which a set of treatments have been 
allocated and further stratification has been performed on some of 
the prognostic variables, either prospectively or retrospectively. 
The developments of the logrank test are then useful in
expressing group differences in a single statistic. The development
of the previous section reinforces the notion that the logrank test
is a useful test for trials and further that it can be set within
a larger theoretical framework.

In practice we are interested in a comparative study
of the general expressions of (2.3.1), (2.3.2) and (2.3.3) as
applied to a set of clinical trial data from Edinburgh. At this
stage we will consider the relation of the special forms of the
w vector of the last section to the various hazard rates of the
prognostic groups. We will not deal with inferences drawn from the
shape of the hazard rates, since the models of Chapters 4 and 6 are
more appropriate for this. The actual data will be discussed in
greater detail in Chapter 7, with some history of the topics and
problems related to the treatment of breast cancer.

Basically the data consists of 561 cases treated for 
breast cancer either by radical mastectomy or simple mastectomy 
plus radiotherapy to the axilla. 

Here we record the survival times only, and the primary
purpose in the use of this data is to assess the relative merit of
the arms of the trial for the total group of patients and then
according to various prognostic factors. In Chapter 7 we deal with
the situation of more than one response variable and consider
intervening events, such as development of local and metastatic
disease.

The Wilcoxon test, as was explained, attaches more weight
to early events and thus gives a slightly different chi-squared
value from the logrank test for most of the groups in our data.
Earlier we presented these tests as 2 x 2 x r contingency tables.
In fact the logrank is the most powerful test given that the
second order interaction is negligible.
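
The structural difference between the two tests can be sketched as a
single weighted statistic. The following is a minimal illustration
rather than the program used for this thesis: it assumes Gehan's form
of the Wilcoxon test, in which each death time is weighted by the
number of patients at risk, against a weight of one for the logrank
test. The function name and the small data set are hypothetical.

```python
def weighted_chi2(times, events, groups, weight="logrank"):
    """Two-sample weighted logrank chi-squared on 1 df.

    times  : follow-up times
    events : 1 for a death, 0 for a censored case
    groups : 0 or 1 (treatment arm)
    weight : "logrank" (w = 1) or "wilcoxon" (w = number at risk)
    """
    score, variance = 0.0, 0.0
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t in death_times:
        n = sum(1 for x in times if x >= t)                      # at risk
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e and g == 1)
        w = 1.0 if weight == "logrank" else float(n)
        score += w * (d1 - d * n1 / n)                           # O - E
        if n > 1:                                                # hypergeometric variance
            variance += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return score * score / variance


# Hypothetical data: arm 1 fails at times 1 and 2, arm 0 at times 3 and 4.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
groups = [1, 1, 0, 0]

chi2_logrank = weighted_chi2(times, events, groups, "logrank")
chi2_wilcoxon = weighted_chi2(times, events, groups, "wilcoxon")
print(chi2_logrank, chi2_wilcoxon)
```

Because the number at risk falls with time, the Wilcoxon weights
decline along the time scale, which is the sense in which early events
count for more.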



In general there may be more than one set of independent
variables acting and thus we will perform an analysis based on the
various subgroups of patients. Thus our data can be expressed
by various probability values, related through a likelihood
function by the following formulation.

Likelihood (particular subgroup) = Π terms due to cases dead
                                 × Π terms due to cases censored
                                 × Π terms due to time dependencies.

In the above likelihood formulation we have introduced
time dependency. It is difficult to establish a complete meaning
of time dependency without resorting to empirical hazard rates. We
will do so in Chapter 3. For the present section it is important
to consider a comparison of the Wilcoxon and the logrank test, using
a set of trial data. Such a comparison is intended to serve as a
representation of the effects of the two tests for different hazard
rates. As we will indicate in the discussion of the data, the
two tests produce the same interpretation of the data. As regards
time dependency, however, we confine their difference to that of
having a different vector of weights in the overall χ² test.
The effect of such weights is of importance only if the
variability of the difference of the proportion of rates is
of relevance.

In an extreme situation that is rarely detected in
practice one may encounter crossing survival rates.
However in the comparison of the logrank and the Wilcoxon test
any time dependency, if it exists, will be reflected by the influence
of late events versus early events. For the purpose of estimation the
logrank and the Wilcoxon test in fact ignore the last term of the
likelihood. In practice, for most situations one can assume
that the effects due to the last term of the above likelihood are
negligible. The slight difference that we will detect between the
logrank and the Wilcoxon test is due to the structural differences
between the two tests. This structural difference however is
essential for the power of the tests in the presence of the most
relevant alternative hypothesis. In the chapters on proportional
hazards, time dependency effects can in fact be tested directly by more
suitable methods.

In the first steps of the analysis we will obtain the
product limit estimates of the two treatments and the corresponding
hazard rates, Figs. (2.5.1) and (2.5.2). The method used for the
plot of the hazard rates is described by Johnson and Johnson (1981).
We use a grouping period of 30 months for all the hazard rate plots.
By the use of the logrank test and then the use of the Wilcoxon test
we obtain the probability values for the difference between the
two survival rates, Table (2.5.1).
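
For readers unfamiliar with the product limit estimate, a minimal
sketch is given below. It is not the program used for Figs. (2.5.1)
and (2.5.2); the function name and the five hypothetical cases are
ours. At each distinct death time the running survival estimate is
multiplied by the proportion of those at risk who survive that time,
and censored cases simply leave the risk set.

```python
def product_limit(times, events):
    """Product limit (Kaplan-Meier) estimate of the survivor function.

    times  : follow-up times
    events : 1 for a death, 0 for a censored case
    Returns a list of (death time, estimated survival just after it).
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cases: deaths at 2, 3 and 5; censored at 3 and 8.
curve = product_limit([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```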

                   No. of   No. of      Expected      χ²      χ²     df     P       P
                   cases    Responses   Responses     LR      W             LR      W

Radical Surgery     288      135         152.05
                                                     10.04   12.23    1   .0015   .0005
Simple Surgery      273      161         133.95
+ XRT

Table 2.5.1

Later we use the modified version of the logrank and Wilcoxon test, so
that apart from obtaining the difference between actual survival
curves of the total treatment groups, we also obtain an overall
treatment comparison adjusted for prognostic variability. In the
process of obtaining the adjusted comparisons, a comparison for each
level of the prognostic indicators is estimated and the final adjusted
comparison is based on weighted differences of the observed and
expected values of each subgroup.

The primary purpose in the comparison of each logrank
statistic and the Wilcoxon test is to study the difference in their
corresponding chi-squared and probability values. Further we
examine the shape of the hazard function and the association between
the patterns of differences between the rates of failure and the way
in which the Wilcoxon test puts more emphasis on early events.

Another manner of looking at the effects of time scale
will be considered in Chapter 7 by use of the regression-like models
of the life tables. In these models we relate the shape of the hazard
and the time dependency indicators.

The prognostic indicators that we use are Age,
Node status, Stage, Size and Menopausal status.

                      No. of   No. of   Expected        χ²      χ²    df        P       P
                       Cases   Deaths     No. of        LR       W             LR       W
                                          Deaths

Menopausal status
  Premenopausal          163       59      98.53
  Menopausal              38       20      20.26
  Postmenopausal         359      216     176.21     25.12   21.97           .000    .000

  Pre & R                 89       27      35.74
  Pre & S                 74       32      25.26      3.17             1    .0752
  Meno & R                21       11      11.66
  Meno & S                17        9       8.34      0.09             1    .7618
  Post & R               178       97     115.21
  Post & S               181      119     100.79      6.21             1    .0127
  ADJUSTED (R v S) for Logrank                        9.05                  .0021

Node status
  N0                     375      184     210.02
  N1                     181      112     185.98     10.96   10.84     1    .0007   .0009

  N0R                    199       83     102.54
  N0S                    179      101      81.46      8.56             1    .0034
  N1R                     88       51      58.19
  N1S                     93       61      53.81      1.90             1    .1683
  ADJUSTED (R v S) for Logrank                        9.94             1    .0016

Tumour size
  T1                      56       17      35.76
  T2                     397      213     208.11
  T3                     107       65      51.13     13.82   10.62     2    .0010   .0047

  T1R                     37       13      10.96
  T1S                     19        4       6.04      1.08             1    .2981
  T2R                    198       40     115.88
  T2S                    199      123      97.12     12.76             1    .0004
  T3R                     53       32      33.84
  T3S                     54       33      32.16      0.04             1    .8343
  ADJUSTED (R v S) for Logrank                        8.4              1    .0038

Stage
  S1                     307      147     171.76
  S2                     141       79      69.91
  S3                     112       69      53.33      9.41   10.97     2    .0090   .0067

  S1R                    164       65      85.15
  S1S                    143       82      61.84     11.41             1    .0007
  S2R                     67       35      39.65
  S2S                     74       44      39.33      1.11             1    .2914
  S3R                     57       35      35.67
  S3S                     55       34      33.33      0.03             1    .8714
  ADJUSTED (R v S) for Logrank

Age
  <40                     31       14      17.51
  40-50                  168       66      96.40
  50-60                  174       96      88.75
  60+                    188      120      93.34     18.65   23.70     3    .0003   .0000

  <40R                    20        8       9.23
  <40S                    11        6       4.77       .48             1    .4363
  40-50R                  89       35      35.65
  40-50S                  79       31      30.35       .03             1    .8721
  50-60R                  85       39      50.64
  50-60S                  89       57      45.36      5.70             1    .0170
  60+R                    94       53      67.07
  60+S                    94       67      52.93      6.78             1    .0092
  ADJUSTED (R v S) for Logrank                       10.52             1    .0012

Table (2.5.2)

A point to note regarding the tests in Table (2.5.2)
is that we intend to compare the logrank test with the Wilcoxon test.
The most noticeable source of discrepancy, if any, in terms of
magnitude will be detectable in the study of the actual prognostic
indicators rather than treatment comparisons. For this reason a
comparison of the two tests based on prognostic differences will
suffice. A consideration of the results of Table (2.5.2) indicates
that radical surgery without radiotherapy is producing longer
survival times than simple surgery with radiotherapy.

The prognostic factors indicate that stage one, two and 
three are respectively ordered in terms of their progress of the 
disease and the later risks of development of the disease. The 
stage one group produce a treatment difference that is much greater 
than stage two and three tumours. This is an indication that the 
actual value of the stage may be interacting with treatment. It 
is not possible now to discuss this point further or substantiate 
with a formal test. In Chapter 6 we will do so. 

The indication of different values of the treatment effects
appears for some other prognostic indicators. Menopausal status and
age indicate that for post menopausal patients and for the over 50's
age group, the treatment differences are at their highest. This, as
pointed out earlier, may be due to sample size rather than the
treatment effects on prognostic strata. The numbers of patients in
the menopausal group are rather low and thus with the present
method we will not study the effect of treatment status further.

The adjusted rates on logrank statistics for each of the
prognostic indicators carry the same message as the unadjusted rates.
That is, we detect a better survival rate for the radical surgery group.



In the case of age, the subgroups contain a reasonable 
number of cases in each category and a statement in a descriptive 
manner may be made regarding interaction between age and treatment. 
The survival rates for the 50-60 group give a significance level for 
the treatment difference of 0.0170 and for the 60+ group a level 
of 0.0092. The younger patients give probability levels that are 
not significant in terms of treatment differences. This effect is 
more notable for the 40-50 group. It must be noted however that 
this apparent difference is not a statistical indication of a difference 
in treatment effectiveness for the different age groups. Such formal 
tests will be performed in Chapters 6 and 7. 

On considering the hazard rates and the corresponding logrank
tests, there is an indication that the treatment effects are in a
similar direction for all prognostic subgroups. However it must
be pointed out that in terms of extent of the risks on the time
scale they are not always similar. Figures (2.5.5) and (2.5.6)
together with the corresponding logrank tests suggest that older
patients produce a higher failure rate when they are treated with
simple mastectomy and radiotherapy. Further there seems to be an
indication that risks are reduced for the 50+ group 7 years after
treatment, while risks remain the same for the rest. Figures
(2.5.7) and (2.5.8) with the corresponding logrank tests suggest
a similar pattern for the menopausal status, which conforms to the
age interpretations. Both menopausal and premenopausal groups produce
higher initial hazard rates than the postmenopausal groups but the
rates later converge. In Chapter 6 we will perform tests on such
time dependencies.

For the size of the tumour, there is a slight indication that
hazard rates are of similar pattern for all groups during most of
the time scale. However, the larger tumours after an initial period
of constant risk produce lower levels in later stages of the disease.
The main purpose for putting this emphasis on time dependency of
size and age is to relate the findings to logrank and Wilcoxon tests.
The above points are similarly noted for the differences between the
two tests. The Wilcoxon test for the categories of age gives
a chi-squared value of 23.7 against a logrank value of 18.65. The
reverse is true for the size of the tumour. That is, the Wilcoxon
test gives a chi-squared value of 10.62 and the logrank test gives
a higher value of 13.82, indicating that the differences may be due
to later events. The two tests do not differ to an important degree
and other main effect prognostic categories show even lower
differences. In fact the difference between the two tests of size
may be coincidental. Inferences about variables of prognostic
importance from this study will be considered again in Chapters 6
and 7. In these chapters a more detailed model will be used and
indirectly we will explain some of the differences between the
logrank and the Wilcoxon tests.



CHAPTER 3 



PARAMETRIC METHODS AND HAZARD FUNCTIONS

In the previous chapter we discussed a set of non-parametric
statistical methods for the analysis of survival data.
In this chapter we will be dealing with the parametric methods.
Within the descriptions of this chapter we will discuss a few
possible hazard functions from empirical data.

3.1 Commonly used parametric methods in survival analysis.

These methods follow the general philosophy of parametric
statistics by which we assume that time to a critical event is a
random variable and based on this postulate we may assign a
frequency distribution to the survival times. Basically the
distribution functions must be able to approximate the empirical
life-tables which present the cumulative proportion of cases
surviving against the time scale of events. It is often difficult
to visualise differences between classes of survival functions or
to identify them purely from an inspection of the distribution
function, in that the survival distribution is always a decreasing
function. However for purposes of defining and classifying
distribution functions, their transformation to hazard functions
plays an important role, so that by a visual display of such
functions a pattern of events can be observed. The hazard rates
in fact present the rate of decline of the survival curves relative
to the proportion still surviving, and thus the pattern of the
hazard can be useful for the purpose of identification between the
empirical and frequency distributions. In the early stages of the
chapter we are not interested in the effects of treatments or
prognostic variates, but rather in the possible families of
distribution functions that may be useful in the applications of
fitting parametric distribution functions to life-tables.



First we describe 3 rather general methods and the plots of
their hazard functions.

Exponential ($\lambda > 0$; $t > 0$)
    Hazard rate:            $\lambda(t) = \lambda$
    Death density function: $f(t) = \lambda \exp(-\lambda t)$
    Survivorship function:  $S(t) = \exp(-\lambda t)$

Weibull ($\mu > 0$; $p > 0$; $t > 0$)
    Hazard rate:            $\lambda(t) = \mu p t^{p-1}$
    Death density function: $f(t) = \mu p t^{p-1} \exp(-\mu t^{p})$
    Survivorship function:  $S(t) = \exp(-\mu t^{p})$

Rayleigh ($\lambda_0 > 0$; $\lambda_1 > 0$; $t > 0$)
    Hazard rate:            $\lambda(t) = \lambda_0 + 2\lambda_1 t$
    Death density function: $f(t) = (\lambda_0 + 2\lambda_1 t) \exp(-\lambda_0 t - \lambda_1 t^{2})$
    Survivorship function:  $S(t) = \exp(-\lambda_0 t - \lambda_1 t^{2})$

The first two of the above distributions are in fact members of
the same family, and the final distribution will be referred to as
a special case of Taulbee's approach later in this chapter.
Figures (3.1.1) to (3.1.9) illustrate the various functions for the 
three distributions at variable parameter values. 
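
The relation between the three functions of each distribution can be
checked numerically. The sketch below is an illustration only. It
assumes the parametrisation S(t) = exp(-mu t^p) for the Weibull and
S(t) = exp(-lambda0 t - lambda1 t^2) for the Rayleigh, and the
parameter values are hypothetical. In each case the death density is
the hazard multiplied by the survivorship function, which should agree
with minus the slope of the survivorship function.

```python
import math


def exponential(t, lam):
    """Return (hazard, survivor) for the exponential distribution."""
    return lam, math.exp(-lam * t)


def weibull(t, mu, p):
    """Return (hazard, survivor), assuming S(t) = exp(-mu * t**p)."""
    return mu * p * t ** (p - 1), math.exp(-mu * t ** p)


def rayleigh(t, lam0, lam1):
    """Return (hazard, survivor), assuming a linear hazard lam0 + 2*lam1*t."""
    return lam0 + 2 * lam1 * t, math.exp(-lam0 * t - lam1 * t * t)


def density(hazard, survivor):
    """Death density as hazard times survivorship."""
    return hazard * survivor


def numeric_density(dist, t, *params, step=1e-6):
    """Minus the central-difference slope of the survivorship function."""
    s_lo = dist(t - step, *params)[1]
    s_hi = dist(t + step, *params)[1]
    return -(s_hi - s_lo) / (2 * step)


# Hypothetical parameter values, evaluated at t = 2.
cases = [(exponential, (0.3,)), (weibull, (0.2, 1.5)), (rayleigh, (0.1, 0.05))]
for dist, params in cases:
    h, s = dist(2.0, *params)
    print(dist.__name__, h, s, density(h, s), numeric_density(dist, 2.0, *params))
```

Setting p = 1 in the Weibull functions returns the exponential case,
which is the sense in which the two belong to one family.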

In the previous chapter we presented some of the empirical
hazard rates for the old Edinburgh trial data, in specific subgroups
of patients. It is important to note that at this stage we mention
hazards in general terms for the total population. In practice
hazard rates can show different quantitative failure rates for
different subgroups. Under a parametric context for a comparative
study of the survival rates we are interested in the magnitude of
a parameter that can best describe the differences between the
various subgroups of patients. Later in this chapter we will
derive the necessary estimates of the above distributions with
covariate effects present. In applying such methods we make certain
assumptions on the actual form of the hazard rates in choosing a
particular model for describing the relevant differences. By a
visual inspection of the hazard rates one can then judge how well the
data conform to the assumptions of the statistical method.



3.2 Examples of hazard functions and families of distribution for
survival analysis.



The most common parametric distributions used in clinical
trials for the survival of patients are the exponential and the
Weibull distribution. The Weibull offers a wide range of increasing
and decreasing hazard rates, with the exponential function being
a special case for a constant hazard model. As will be shown, these
two distributions belong to a family of proportional hazard models
with covariates. The assumption of proportional hazards requires
that the hazard rate for each subgroup must be a constant multiple
of a baseline hazard rate common to the entire set of subgroups.
S. Gore (1981), Figure (3.2.1), shows an example of breast cancer
trial data in which the assumptions of proportional hazards are
violated. Without a visual inspection of the hazard functions, there
is a strong temptation to use one of the robust proportional hazard
models, such as that described in Cox's (1972) paper. An inspection
of the hazard function can also lead to choosing a more efficient
analysis based on parametric methods, while an approach purely based
on tests of significance using completely non-parametric methods may
over-generalise the pattern of failure.

In the plots of the exponential, Weibull and Rayleigh
distributions some forms of constant, increasing and decreasing
failure rates were presented. Later we will discuss some u-shaped
and cone-shaped hazard rates that can arise from trial data.

Turner et al (1976) consider a general 3 parameter family 
of survival distributions. This family is able to generate an 
extensive number of distributions that can be used in a survival 
analysis. The general survival function is given by 

For t ≥ 0, S > 0, p > 0
and -∞ < n < ∞.
The probability density function is 

" f(t) = Hi [ S(t) (1-np)/d + P) . S(t) (^n)/( 1+P ) ] }1+P (3<2 _ 2) 

These functions provide a set of highly flexible distributions with
many differing shapes for the hazard functions, such as increasing,
decreasing, constant and cone shaped hazards. This variability
of the hazard rates can be mimicked by the range of the distributions
of the form Gamma, Weibull, Lognormal, Rayleigh, Single hit and
Arrhenius. The most important advantage in the use of Turner's
family of distributions is that all these distributions can be
defined by only 3 parameters. This offers a formal test for
comparison of shapes of hazards. In a related paper by Bertranou
et al (1978) Turner's family of distributions with concomitant
variables is used. They use a maximum likelihood estimator for
the parameters using a method of Hazelrig et al (1978). The most
important problem in the general use of this approach in survival
analysis so far has been that of adopting an estimation procedure
capable of dealing with the complications of censored survival
data.

Bertranou et al (1978) compare life expectancy in two
groups of children treated without surgery in Tetralogy of Fallot.
The data is not based on a randomised trial, but is formed of clinical
information and autopsy data. In this approach the analysis begins
with the study of the possibility of detecting changing risk patterns
among subgroups. Further, a comparison of estimates within each
group can easily be made without making the restrictive assumption
that the two subgroups have similarly shaped survival distributions.
It may well be expected that this approach, by being a parametric one,
provides a better approximation to the hazard functions. The result
is an effective procedure for estimating parameters of the
distribution. In their conclusion Bertranou et al (1978) support the
generally accepted view that "the natural history of person born with
tetralogy of Fallot is determined primarily by the severity of the
pulmonary stenosis, as demonstrated by the tendency of the person
with pulmonary atresia to die at a younger age than those without
pulmonary atresia or the group as a whole".

Using this parametric approach, the conclusions
regarding the results, both in terms of survival times of the main
treatment groups and of the subgroups, are the same as those of the
alternative non-parametric or single parameter approaches. However,
within the present setting they provide the corresponding hazard
functions for the different groups, Figure (3.2.2).

It is clear that the highest risk period for pulmonary
stenosis is the first two years and, unlike pulmonary atresia, the
risk of death does not decline in the later years, probably due to
relatively high risks in the second decade.

With the parametric estimation of the hazard rates, clear
cut functions are produced that give an intelligible reduction of
the data on the timescale. An empirical plot would yield the same
patterns and the same information. However Turner's generic
family of survival curves has the distinct advantage that by
inclusion of extra parameters, there is a possibility of testing the
hazard functions and obtaining a distribution from its hierarchy that
yields the most parsimonious fit. The difficulty with this approach
can be the interpretation of the results if the estimated parameters
of the hazards differ greatly. Since there are 3 parameters it will
be difficult to decide on the meaning of such patterns. An extreme
example is a situation where one group has a higher initial hazard
rate followed by a constant hazard rate and another group has
initially a low hazard rate followed by a high rate. The outcome
can be two survival curves that cross each other somewhere in the
time scale. In the extension of the Weibull model with covariates
to a proportional hazards model in the next section we will discuss
these points in less extreme situations.

With a non-parametric approach tests can also be 
constructed to assess hazards. However if a parametric approach 
is justified, there will be a loss in efficiency in adopting a non- 
parametric method. It must also be emphasized that in practice the 
Turner's generic family may be too generous in providing a range of 
distributions, where the main aim is to assess effects of treatments 
and covariates. 

Barlow et al (1978) adopt a more confined approach in the
classification of survival distributions. In their terminology they
speak of failure rates rather than hazard rates. Three classes of
distributions are defined in their terminology, (a) increasing
failure rates, (b) decreasing failure rates, (c) u-shaped failure
rates. Turner's family also includes a cone shaped hazard which
belongs to the Arrhenius distribution, as was mentioned earlier in this
section.

An example of an increasing failure rate would be a healthy
population of over 50 years of age. In such a group one would expect
that the effects of old age will become increasingly dominant and
hence with increasing age the number of deaths will increase. A
possible hazard curve for such a population is the Weibull distribution
with shape parameter p = 1.5. Figures (3.2.3) and (3.2.4) represent
the corresponding hazard and survivor function.

The time immediately after a major operation is a
critical period for the patients. Often the patient is recovering
from anaesthetics, which add extra risk to the survival of the
patients. However if there is no progression of disease and the
population is young enough not to be affected by old age, a possible
survival distribution would have a relatively high rate of death
in the beginning of the time scale. With the passage of time the
normal functions of the body can take over and the death rates could
decrease and conform to those of a healthy population. A Weibull
distribution with the parameter p = 0.5 can be a possible
distribution to approximate such a population. Figures (3.2.5) and
(3.2.6) represent the corresponding hazard and survivor function.

Had we not made the above assumptions regarding the age
of patients, then the effects of old age would become increasingly
dominant in our population. Further assuming that after the operation
there is still some possibility of the progress of disease, as the
case may be in a population of post-menopausal Stage I and II breast
cancer patients treated by mastectomy and operative radiotherapy, then
the hazard rate will be composed of a declining hazard rate followed by
an increasing hazard. In fact the life table of all ages of the
population of a country shows such a hazard rate. At birth the
newly born experiences the highest risk of illness and death; with
development and growth of the child the risks decrease until later in
life new risks of death develop due to old age. Figure (3.2.3)

The u-shaped pattern of failure, Figure (3.2.7), can be
interpreted as suggesting two forms of failure, one due to risks
of early life and birth and the other due to old age. In a
clinical trial situation if such a pattern is apparent, the constant
hazard period in the middle tends to be much shorter. This gives
rise to one of the important methodological problems in clinical
trials; that is defining the relevant causes of death.

In the previous example on breast cancer treatment, three 
processes were taking place, each of which can contribute to death. 
The first factor is the side effects of the treatment, that is 
mastectomy and radiotherapy in the initial period. Secondly there 
are risks due to the general progress of the disease either locally 
or due to metastatic disease and finally there is death due to old age. 

The next example of a hazard function we consider is a 
cone shaped hazard. The u-shaped hazard was a combination of a 
decreasing hazard rate followed by an increasing hazard rate. The 
cone shaped hazard is the reverse of this. It begins with an 
increasing failure rate, reaches a peak and then falls. Time to 
development of metastatic disease in cancer patients can have a cone 
shaped hazard. 
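
The u-shaped and cone shaped hazards described above can be sketched
numerically. The example is an illustration under assumptions of our
own: the u-shape is built by adding a decreasing Weibull hazard
(p = 0.5) to an increasing one (p = 3), as for two independent causes
of failure, and the cone shape is represented by a lognormal hazard as
a convenient stand-in. All parameter values are hypothetical.

```python
import math


def weibull_hazard(t, mu, p):
    """Weibull hazard, assuming S(t) = exp(-mu * t**p)."""
    return mu * p * t ** (p - 1)


def u_shaped_hazard(t):
    """Early risk (decreasing) plus late risk (increasing)."""
    return weibull_hazard(t, 0.5, 0.5) + weibull_hazard(t, 0.01, 3.0)


def lognormal_hazard(t, m, sigma):
    """Lognormal hazard: density divided by survivor function."""
    z = (math.log(t) - m) / sigma
    dens = math.exp(-0.5 * z * z) / (sigma * t * math.sqrt(2 * math.pi))
    surv = 0.5 * math.erfc(z / math.sqrt(2))
    return dens / surv


def cone_shaped_hazard(t):
    """Rises to a peak and then falls (hypothetical m = log 5, sigma = 1)."""
    return lognormal_hazard(t, math.log(5.0), 1.0)


for t in (0.1, 0.5, 2.0, 3.0, 10.0, 50.0):
    print(t, round(u_shaped_hazard(t), 4), round(cone_shaped_hazard(t), 4))
```

The printed values fall and then rise for the first function, and rise
and then fall for the second.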

In the early stages, the disease is confined to local areas
and hence there is a low probability of development of metastatic
disease, depending on the form of cancer. With the passage of time
the chances of developing metastatic disease increase. For operable
breast cancer patients this peak may be reached within 5 years of the
detection of the disease. In such a group of patients, there will be
some patients with a better prognosis who will not develop metastatic
disease. These patients can be increasingly distinguished from the
rest who have developed metastatic disease by passage of time. If
a patient has not developed metastatic disease in the first five years,
the chances of developing metastatic disease diminish in the
subsequent years. For this reason the peak in the 5th year hazard
rate should begin to fall. Figures (3.2.8) and (3.2.9) relate to
the hazard rates and survival functions of the above discussion.

Prout, Slack and Bross (1973) discuss a rather interesting
population of invasive bladder cancer patients. Criterion for entry
into the trial is that patients must have non-invasive bladder cancer,
but also must pass a test indicating that there is no metastatic
disease present. After 10 years of follow-up the hazard rate is
observed. The hazard curves show two separate peaks for the
population. Prout et al consider the reason for the appearance of
two cones to be that the population is composed of two very
different prognostic groups. This effect is also later indicated
by biological evidence. One major entry criterion is a negative
result on the metastatic disease test. Patients who enter the trial
must have shown a negative result with the test. However a group of
patients who do not show any evidence of metastatic disease and are
test negative are in fact metastatic patients who have not been detected
by the test. The first peak of the hazard is due to these patients.
The second peak is due to the rest of the population, who are
non-metastatic at the time of entry, but develop metastatic disease
later in the course of the progress of disease. Figures (3.2.10)
and (3.2.11) present the hazard and survivorship functions for this
population.



Finally in a paper, L.E. Rutquist et al (1982) set out
to answer the question "is breast cancer a curable disease?" They
consider cure to be synonymous with a pattern of survival rates
conforming to survival rates of a normal healthy population. For
reasons of comparison they note that there exist two different
mortality rates, one due to the uncured cases assumed to be constant
over time and the other for the cured patients subject to risks of
a normal healthy population. Therefore they assume a two parameter
model representing sums of two exponential models as appropriate.
Further they consider a log normal distribution giving a low initial
mortality which rapidly increases to a maximum, with a slow
decrease in mortality after the maximum has occurred. In their
conclusion it is noted that excess mortality from breast cancer is
present at least 18 years after treatment.

One point to note in the above study as well as in some 
of the previous methods is that for purposes of inference they adopt 
a chi-squared (\chi^2) test of goodness of fit for the comparison of 
the expected and observed values of the survival distributions. Another 
commonly used method for the estimation of relevant parameters is the 
maximum likelihood method. We will discuss this approach in more detail 
within the discussions of the covariates. 

In the above discussions much importance was attached to 
the shape of the hazard rates. Examples of empirical data were 
discussed and some parametric distributions were mentioned that 
can approximate the distributions. The shape of the hazard may 
give useful information as far as the biological nature of the 
progress of disease is concerned. It can also introduce tests of 
significance. For example we could test the null hypothesis of a 
constant hazard (exponentially distributed density function) against 
the alternative of increasing or decreasing hazards (Weibull density 
functions). In the next section we will discuss parametric 
distributions purely for the purpose of testing the effects of 
treatments and other concomitant variables, by the use of covariates. 

3.3 Inclusion of Covariates. 

Once a decision is made on the shape of the hazards that 
may be fitted to the population, an additional function may be 
combined with the hazard function to form a hazard function for a 
specific sub-population. The additional information that enters 
through the extra function is referred to as the concomitant information. 
Examples of concomitant information are indicators for the treatment 
effects, age of patient at entry, stage of disease, size of tumour 
and other prognostic indicators. These additional sets can be 
used either singly or in combination to estimate parameters so that 
a distinct survival distribution may be fitted to each subgroup. 
The estimated values of such parameters will be used to assess the 
significance of the survival differences between two or more subgroups. 

In the above discussion we have made a necessary distinction 
between the hazard functions and the concomitant variables. The 
former provide information on the rate of failure of the patients while 
the latter define subgroups of patients. The distinction may at times 
be more complex, and it may be difficult to decide which parameter is 
appropriate for assessment and comparison of the subgroups. Using 
the extreme example of crossing survival curves, an interpretation of 
the parameter estimates can depend to some extent on the weighting 
attached to the various points in time. However it is not a problem 
that one often encounters in practice. We proceed now with the 
development and representation of parametric statistical methods 
which are useful in clinical trials. 

For each case entered into the trial, in addition to the 
failure time or censoring time t_i and the indicator variable \delta_i 
(0 for censored, 1 for uncensored response), there exists a vector 
Z_i = (Z_{1i}, \ldots, Z_{ri})' of covariate indicators or explanatory 
variable indicators. Then according to the previous definitions 
of the hazard rates for each subgroup we can represent the hazard rate 
as 

    (Hazard at time t, for subgroup k) = (General hazard at time t) 
                                         \times (Function of variable 
                                                 indicators for subgroup k) 

In the simple case of the exponential distribution with the general 
hazard rate \lambda we can write the above as 

    \lambda(t \mid Z_i) = \lambda \times \exp(\beta' Z_i) 

where \exp(\beta' Z_i) is a mathematically convenient function for 
representing multiplicative effects of indicator variables. \beta is 
a vector to be estimated and represents a set of coefficients 
associated with the covariates and is used for the testing of 
prognostic effects indicators. One point to note at this stage is 
that our formulations need not be as restrictive as the above 
formulation. Later we will discuss a group of models that are 
based on the following formulation. 

    (Hazard at time t, for subgroup k) = (General hazard at time t 
                                          for subgroup k) 
                                         \times (Function of variable 
                                                 indicators for subgroup k) 

The former models are in general named proportional hazard models 
and an example of the latter model is the accelerated failure time 
model. 

The proportional hazard models are expressed as 

    \lambda(t, Z_i) = \lambda_0(t) \exp(\beta Z_i) 

where \lambda_0(t) is a function of time referring to a base line hazard 
rate. In the case of the exponential it is not dependent on time 
and in the case of the Weibull it is expressed as 
\lambda_0(t) = \mu \nu t^{\nu - 1}, where \mu and \nu are scale and 
shape parameters. 
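
The defining property of the proportional hazard formulation, that the 
ratio of the hazards of two subgroups does not depend on time, can be 
checked numerically. The following is a minimal sketch in Python; the 
function names and the parameter values are ours and purely 
illustrative, and are not taken from the trial data. 

```python
import math

def weibull_baseline(t, mu, nu):
    """Base line hazard lambda_0(t) = mu * nu * t**(nu - 1); nu = 1 is the exponential case."""
    return mu * nu * t ** (nu - 1.0)

def ph_hazard(t, z, beta, mu, nu=1.0):
    """Proportional hazard lambda(t, Z) = lambda_0(t) * exp(beta . Z) for a covariate vector z."""
    linear_predictor = sum(b * x for b, x in zip(beta, z))
    return weibull_baseline(t, mu, nu) * math.exp(linear_predictor)

def hazard_ratio(t, z_a, z_b, beta, mu, nu=1.0):
    """Ratio of the hazards of two subgroups at time t."""
    return ph_hazard(t, z_a, beta, mu, nu) / ph_hazard(t, z_b, beta, mu, nu)

# Hypothetical values: one binary treatment indicator with coefficient 1.5.
beta = [1.5]
for t in (1.0, 12.0, 60.0):
    # The base line hazard changes with t, but the ratio stays at exp(1.5).
    print(t, hazard_ratio(t, [1], [0], beta, mu=0.01, nu=1.4))
```

Whatever the shape parameter, the base line term cancels in the ratio, 
which is why \beta can be interpreted without reference to time. 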

One point in introducing the concept of covariates is that 
it enables us to compare different treatments for a single disease. 
Further it is possible to identify auxiliary factors that influence 
survival times. The use of concomitant information is an approach 
for identifying the factors that are associated with the survival 
times in relative terms. This latter emphasis is different from the 
discussion of earlier parts of this chapter on parametric methods, 
which dealt with parametric estimation of survival times and a 
possible interpretation based on the functional form of the 
parametric models. The procedure commonly used for the estimation 
and testing of the effects of the covariates is termed 
maximum likelihood estimation and is dealt with by S.D. Silvey (1975). 
Basically, to assess the effects of factors influencing the survival 
times we require a function that can express the survival experience 
of all cases. Thus 

    likelihood function = \prod_{all patients} (likelihood of the survival 
                                                experience of a case) 

Further we can distinguish censored cases and responding cases, 
and thus we write 

    likelihood function = \prod_{deaths} (death density function) 
                          \times \prod_{alive} (survival function) 

By the definitions of the hazard functions, survival functions and 
density functions we can write the above equivalently as 

    likelihood function = \prod_{deaths} (hazard function) 
                          \times \prod_{all} (survival function) 
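
The last form of the likelihood translates directly into a computation. 
The following is a minimal sketch in Python, on the log scale, which 
takes the hazard and survival functions as arguments; the function name 
and the small data set are hypothetical and serve only as illustration. 

```python
import math

def censored_log_likelihood(times, events, hazard, survival):
    """Sum of log hazard over deaths plus log survival over all cases."""
    total = 0.0
    for t, delta in zip(times, events):
        if delta == 1:                 # deaths contribute the hazard term
            total += math.log(hazard(t))
        total += math.log(survival(t))  # every case contributes the survival term
    return total

# Hypothetical example: constant hazard 0.5, three cases, one censored.
rate = 0.5
example = censored_log_likelihood(
    times=[1.0, 2.0, 3.0],
    events=[1, 0, 1],
    hazard=lambda t: rate,
    survival=lambda t: math.exp(-rate * t),
)
print(example)
```

For the constant hazard the result agrees with the closed form 
n* ln(rate) - rate * (sum of times), here 2 ln(0.5) - 3. 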

Each of the above functions in brackets can have a mathematical 
formulation, based on insight into the distributional form of the 
data. Further each of these formulations may be defined by a set 
of parameters. Our intention is to use a procedure to estimate 
the best values of parameters that can explain the survival 
experience of the population with the least number of parameters 
and with an acceptably low difference between estimated, expected and 
actual survival times. Later in this chapter we will develop the 
above formulation for the exponential, Weibull and Taulbee approaches. 
Also based on the distributions we will define general families of 
functions. 

When parametric methods are used in conjunction with 
covariate effects more care is needed in identifying the correct 
functions. What is crucial in survival analysis, as in any branch of 
applied statistics, is obtaining a reasonable fit to the data. A 
commonly used indicator of a good model for the data is the pattern 
of the residuals, where the residuals are defined to be a function of 
the difference between predicted and observed values. Further a plot 
of the data can at times indicate whether the theoretical model's range 
can fall within the variability of the data. 

An example is the situation where it is assumed that there 
exists a constant hazard rate for a population. Therefore the best 
model to fit is conjectured to be the exponential distribution, which 
has a constant hazard rate. Once the model is fitted and the values 
of the residuals are compared, possible shortcomings of the model may 
become apparent. If the conjecture is substantiated by the data, 
then the outcome would be a set of residuals that follow a constant 
pattern through time. In situations where the hazard rate is 
increasing or decreasing a similar pattern will be reflected by the 
residuals. 

In the formulation of the likelihood with the proportional 
hazard assumption the hazard rate is assumed to be dependent on the 
covariates only through the \exp(\beta Z) function. It is however 
possible that in some situations with the passage of time the effects 
of covariates may change. One manner in which we can test the time 
dependency assumptions of covariates in a proportional hazard model 
is by formulation of a likelihood such as 

    likelihood function = \prod_{death cases} (hazard function) 
                          \times \prod_{all cases} (survival function) 
                          \times \prod_{all cases} (time dependency function) 
                                                                    (3.3.1) 



In the following sections of this chapter we will study 
in detail different methods for the estimation of covariate effects 
for different shapes of hazard rates. Before doing so we will remark 
on the various advantages of the approaches that have been discussed 
so far. 

Earlier we mentioned Turner's family of distributions 
as a flexible multi-parametric method by which it is possible to 
obtain a close fit to the subgroups of the data as a method for 
data reduction. For reasons of comparison between subgroups, there 
may be situations where it is sufficient to use a multi-parametric 
method for a base line hazard rate together with a simple single 
parameter relative risk. Clearly estimation of a large number of 
nuisance parameters is an important consideration in such a study. 
Alternatively in other situations a multi-parametric method may be 
used with a multi-parameter relative risk for each subgroup. This 
approach has the disadvantage that the interpretation of the 
inference for the subgroups may not be easy. The likelihood 
function (3.3.1) has a major advantage in that the interpretation of 
the covariate effects is much simpler than when separate 
distributions are fitted to different subgroups. 

In terms of the survival curves the above formulation of the 
proportional hazards may be interpreted as follows. Within the range 
of all possible survival curves for the set of prognostic and 
treatment groups, there exists a base hazard rate. All other 
survival curves can be generated by multiplying the 
base hazard rate by the corresponding value of \exp(Z\beta), which 
is a scalar for each subgroup. If the proportional hazard assumption 
holds the final term in (3.3.1) contributes nothing to the covariate 
effects. If alternatively the hazards are non-proportional 
then a time dependent functional form Z(t) must be used 
instead of Z. 

3.4 Polynomial Hazard Rates. 



Taulbee (1979) discusses a generalised form of the 
Rayleigh distribution, in which the hazard has a polynomial pattern, 

    \lambda(t) = \lambda_0 + \lambda_1 t + \lambda_2 t^2 + \ldots + \lambda_m t^m 
               = \sum_{k=0}^{m} \lambda_k t^k                       (3.4.1) 

where m refers to the degree of the polynomial. 



In the presence of covariate effects Z_j for j = 1, \ldots, s 
we may then adopt a substitution for \lambda_k such as 
\lambda_k \exp(\beta_k Z_i), giving 

    \lambda_k(t, Z_i) = \lambda_k t^k \exp(\beta_k Z_i)    for k = 0, 1, \ldots, m 

where in the above definitions we have considered \beta_k to be a 
parameter set to be estimated for each of k = 0 to m. Further for each 
\beta_k there exists a representation 
\beta_k = (\beta_{k1}, \ldots, \beta_{ks}) and for the vector Z_i there 
is a representation Z_i = (Z_{i1}, \ldots, Z_{is}), where s is the 
number of covariates and i refers to a particular case. 

\exp(\beta_k Z_i) is numerically the most convenient function 
although in general we can express the above as 

    \lambda_k(t, Z_i) = \lambda_k t^k h(Z_i, \beta_k)               (3.4.2) 

and let h(Z_i, \beta_k) be for example \exp(\beta_k Z_i), 
(1 + \beta_k Z_i) or (1 + \beta_k Z_i)^{-1}. Here we adopt a general 
definition of \beta_k. However later we will adopt a restricted form 
of \beta where \beta_0 = \beta_1 = \beta_2 = \ldots = \beta, 
giving a proportional hazard type of model. 

In an analogous manner to that of other parametric models such 
as the Weibull, it seems that good prior knowledge is 
required for use of any particular hazard shape. Further the functional 
form of h(Z_i, \beta_k) is related to the derivation of the functional 
form of the subgroup hazard rates from that of the base line hazard, 
\lambda(t, 0). In particular this relation is important for the 
proportional hazard restricted form of the model. We will discuss 
these points later with the use of a particular form of the Rayleigh 
distribution with increasing hazards, that is \lambda_0 > 0 and 
\lambda_1 > 0. By substituting (3.4.2) and expanding (3.4.1) we write 
the general hazard function as 

    \lambda_m(t, Z_i) = \lambda_0 h(Z_i, \beta_0) + \lambda_1 h(Z_i, \beta_1) t 
                        + \ldots + \lambda_m h(Z_i, \beta_m) t^m 
                      = \sum_{k=0}^{m} \lambda_k h(Z_i, \beta_k) t^k 

Using the definitions from the introduction we have 

    S(t) = \exp[ -\int_0^t \lambda(u) du ] 

giving 

    S_m(t, Z_i) = \exp[ -\int_0^t \sum_{k=0}^{m} \lambda_k h(Z_i, \beta_k) u^k du ] 
                = \exp[ -\sum_{k=0}^{m} \lambda_k h(Z_i, \beta_k) \{ \int_0^t u^k du \} ] 
                = \exp[ -\{ \sum_{k=0}^{m} \frac{\lambda_k}{k+1} h(Z_i, \beta_k) t^{k+1} \} ] 
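
As a check on the closed form of the survival function, the integrated 
hazard can be compared with a numerical quadrature. The following is a 
minimal sketch in Python; the function names are ours, and the 
coefficients used in the example are hypothetical values chosen only to 
give a plausible monthly hazard. 

```python
import math

def polynomial_hazard(t, lambdas, h_values):
    """lambda_m(t, Z) = sum_k lambda_k * h(Z, B_k) * t**k."""
    return sum(lam * h * t ** k for k, (lam, h) in enumerate(zip(lambdas, h_values)))

def polynomial_survival(t, lambdas, h_values):
    """Closed form S_m(t, Z) = exp(-sum_k lambda_k / (k + 1) * h(Z, B_k) * t**(k + 1))."""
    cumulative = sum(lam * h * t ** (k + 1) / (k + 1)
                     for k, (lam, h) in enumerate(zip(lambdas, h_values)))
    return math.exp(-cumulative)

def survival_by_quadrature(t, lambdas, h_values, steps=20000):
    """Trapezoidal approximation of exp(-integral_0^t lambda(u) du), used as a check."""
    width = t / steps
    area = 0.5 * (polynomial_hazard(0.0, lambdas, h_values)
                  + polynomial_hazard(t, lambdas, h_values))
    area += sum(polynomial_hazard(i * width, lambdas, h_values) for i in range(1, steps))
    return math.exp(-area * width)

# Hypothetical 2 degree hazard; h = exp(B * Z) with the same B for every k.
lambdas = [0.009, 0.0004, 0.00001]
h_values = [math.exp(1.5)] * 3
print(polynomial_survival(24.0, lambdas, h_values))
print(survival_by_quadrature(24.0, lambdas, h_values))
```

The two printed values should agree to several significant figures, 
since the trapezoidal error for a quadratic integrand is very small at 
this step width. 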



Further using the usual approach to the construction of likelihoods 
we have 

    Likelihood = \prod_{all deaths} (Hazard function) \prod_{all cases} (Survivor function) 
               = \prod_{deaths (i)} \lambda_m(t_i, Z_i) \prod_{all (i)} S_m(t_i, Z_i) 

Then the likelihood function L for a 2 degree hazard is given by 
(we eliminate subscript i for the moment, for brevity) 

    L = \prod [ \lambda_0 h(Z, \beta_0) + \lambda_1 h(Z, \beta_1) t + \lambda_2 h(Z, \beta_2) t^2 ] 
        \times \prod \exp[ -\{ \lambda_0 h(Z, \beta_0) t + \lambda_1 h(Z, \beta_1) t^2 / 2 
                               + \lambda_2 h(Z, \beta_2) t^3 / 3 \} ] 

We now reinsert subscript i, giving 

    \ell = \ln L = \sum_{all (i)} [ \delta_i \ln(F1_i) + F2_i ] 

where 

    F1_i = \lambda_0 h(Z_i, \beta_0) + \lambda_1 h(Z_i, \beta_1) t_i + \lambda_2 h(Z_i, \beta_2) t_i^2 

    F2_i = -\{ \lambda_0 h(Z_i, \beta_0) t_i + \lambda_1 h(Z_i, \beta_1) t_i^2 / 2 
               + \lambda_2 h(Z_i, \beta_2) t_i^3 / 3 \} 



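We now differentiate the log likelihood with respect to each of the 
parameters, to obtain the equations for the maximum likelihood estimates. 

    \partial\ell / \partial\lambda_0 = \sum_{all (i)} [ \delta_i h(Z_i, \beta_0) / F1_i - h(Z_i, \beta_0) t_i ] 

    \partial\ell / \partial\lambda_1 = \sum_{all (i)} [ \delta_i h(Z_i, \beta_1) t_i / F1_i - h(Z_i, \beta_1) t_i^2 / 2 ] 

    \partial\ell / \partial\lambda_2 = \sum_{all (i)} [ \delta_i h(Z_i, \beta_2) t_i^2 / F1_i - h(Z_i, \beta_2) t_i^3 / 3 ] 

    \partial\ell / \partial\beta_{0j} = \sum_{all (i)} [ \delta_i / F1_i - t_i ] 
                                        \lambda_0 \, \partial h(Z_i, \beta_0) / \partial\beta_{0j} 

    \partial\ell / \partial\beta_{1j} = \sum_{all (i)} [ \delta_i t_i / F1_i - t_i^2 / 2 ] 
                                        \lambda_1 \, \partial h(Z_i, \beta_1) / \partial\beta_{1j} 

    \partial\ell / \partial\beta_{2j} = \sum_{all (i)} [ \delta_i t_i^2 / F1_i - t_i^3 / 3 ] 
                                        \lambda_2 \, \partial h(Z_i, \beta_2) / \partial\beta_{2j} 

where \delta_i = 1 for deaths and \delta_i = 0 for censored cases; 
\beta_{0j}, \beta_{1j}, \beta_{2j} are the covariate coefficients for 
each degree of the polynomial (in this case a 2 degree polynomial); 
\lambda_0, \lambda_1, \lambda_2 refer to the hazards polynomial; 
and i is a subscript for each case and j is the index of the covariate 
effect under test. 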
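
Since a programme of this kind has to be supplied with analytic 
derivatives, it is prudent to check them against finite differences 
before use. The following is a minimal sketch in Python for the 2 degree 
model with a single covariate and h(Z, \beta_k) = exp(\beta_k Z); the 
function names and the five cases in the example are hypothetical. 

```python
import math

def f1(t, z, lam, beta):
    """Hazard part F1_i of the 2 degree model with h(Z, B_k) = exp(B_k * Z)."""
    return sum(lam[k] * math.exp(beta[k] * z) * t ** k for k in range(3))

def f2(t, z, lam, beta):
    """Log survivor part F2_i: minus the integrated hazard."""
    return -sum(lam[k] * math.exp(beta[k] * z) * t ** (k + 1) / (k + 1) for k in range(3))

def log_likelihood(data, lam, beta):
    """ln L = sum_i [delta_i * ln(F1_i) + F2_i] over cases (t, delta, z)."""
    return sum(d * math.log(f1(t, z, lam, beta)) + f2(t, z, lam, beta) for t, d, z in data)

def gradient(data, lam, beta):
    """Analytic derivatives with respect to lambda_k and B_k, k = 0, 1, 2."""
    g_lam, g_beta = [0.0] * 3, [0.0] * 3
    for t, d, z in data:
        denom = f1(t, z, lam, beta)
        for k in range(3):
            h = math.exp(beta[k] * z)
            common = d * t ** k / denom - t ** (k + 1) / (k + 1)
            g_lam[k] += h * common
            g_beta[k] += lam[k] * z * h * common   # dh/dB_k = z * exp(B_k * z)
    return g_lam, g_beta

def numeric_gradient(data, lam, beta, rel_step=1e-6):
    """Central finite differences, used only to check the analytic formulae."""
    params = list(lam) + list(beta)
    grads = []
    for j in range(6):
        step = rel_step * max(abs(params[j]), 1e-3)
        up, down = params[:], params[:]
        up[j] += step
        down[j] -= step
        diff = log_likelihood(data, up[:3], up[3:]) - log_likelihood(data, down[:3], down[3:])
        grads.append(diff / (2.0 * step))
    return grads[:3], grads[3:]

# Hypothetical cases: (time in months, death indicator, binary covariate).
cases = [(5.0, 1, 0), (12.0, 0, 1), (20.0, 1, 1), (33.0, 1, 0), (41.0, 0, 1)]
lam0, beta0 = [0.01, 0.001, 0.0001], [0.5, 0.3, 0.1]
print(gradient(cases, lam0, beta0))
print(numeric_gradient(cases, lam0, beta0))
```

Agreement between the two printed gradients gives some assurance that 
the derivatives have been coded as derived above. 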



In the above formulations we have allowed \beta_k to vary 
depending on the degree of the polynomial that approximates the 
hazard rate. This generality violates the proportional hazards 
assumption. A restriction such as \beta_k = \beta for all k converts 
the approach to a proportional hazards version. In such a situation the 
hazard is 

    \lambda_m(t, Z_i) = ( \sum_{k=0}^{m} \lambda_k t^k ) \times h(Z_i, \beta) 

giving for a 2 degree hazard 

    \lambda_2(t, Z_i) = ( \lambda_0 + \lambda_1 t + \lambda_2 t^2 ) \exp(\beta Z_i) 

Further \lambda_1(t, Z_i) using the above generality is an example of 
the Rayleigh distribution hazard rate, with covariates, 

    \lambda_1(t, Z_i) = ( \lambda_0 + \lambda_1 t ) \times \exp(\beta Z_i) 

for \lambda_0 > 0 and \lambda_1 > 0. 



We proceed with this approach in an analysis of the Edinburgh trial 
data, using maximum likelihood estimation. For our particular use 
we adopt a method of maximising the likelihood function using the 
P3R programme of the BMDP for estimation of non-linear regression 
models by the Newton-Raphson procedure. This programme is 
flexible enough for the estimation of the relevant 
parameters of the above named functions. 

The programme requests the actual likelihood function, 
the derivatives with respect to the parameters to be estimated and 
a loss function. In the last section we produced the necessary 
functions and derivatives for a general Taulbee approach. Initially 
we analyse the data for a linear hazard model with the proportional 
hazard assumption. In this analysis we use the treatment option 
given by radiotherapy and simple surgery against radical surgery 
as the main effect of the study. Later with use of the other 
covariates we approach the analysis with the Weibull and the 
exponential models. Throughout we use a survival time scale in 
months. 



First we fit a model with a zero degree (constant) hazard. This is 
equivalent to an exponential model with the proportional hazard 
assumption. With m = 0 we have 

    \lambda(t, Z) = \lambda_0 \times \exp(\beta Z) 

    L = \prod_{deaths} \lambda_0 \exp(\beta Z) \prod_{all} \exp[ -\lambda_0 \exp(\beta Z) t ] 

giving the estimated parameters 

    \lambda_0 = 0.0041     S.E. = 0.00309 
    \beta     = 1.4821     S.E. = 0.4021 
    d.f. = 559,  ln L = -1741.82 

Now we can expand the model by allowing the hazard to be a straight 
line passing through the origin. If we set \lambda_0 = 0, then the 
hazard is 

    \lambda(t, Z) = ( \lambda_1 t ) \times \exp(\beta Z) 

The present model is not suitable for the purpose of analysis in 
that we have introduced two types of restriction, one setting 
\lambda_0 = 0 and the other imposing the proportionality of hazards. 
As a more suitable model we use the next member of this class of 
distributions. 



We now fit a model of the hazard form that allows the 
straight line hazard not to pass through the origin. We therefore 
have to estimate both parameters \lambda_0 and \lambda_1 simultaneously, 
as well as the \beta parameter for the covariates. 

We thus obtain the following estimated parameters for the 
model given by 

    \lambda(t, Z) = ( \lambda_0 + \lambda_1 t ) \times \exp(\beta Z) 

    L = \prod [ \lambda_0 \exp(\beta Z) + \lambda_1 \exp(\beta Z) t ] 
        \times \prod \exp[ -( \lambda_0 \exp(\beta Z) t + \lambda_1 \exp(\beta Z) t^2 / 2 ) ] 

    \lambda_0 = 0.008762    S.E. = 0.00186 
    \lambda_1 = 0.000438    S.E. = 0.00382 
    \beta     = 1.4951      S.E. = 0.4037 
    d.f. = 558,  ln L = -1739.85 



The value of \lambda_1 is close to zero. In fact there is little 
improvement over the original model, in which the \lambda_0 parameter 
alone was used as the hazard function. Now, although the base line is 
approximated better and is less restricted, the estimator of the 
treatment effect is virtually unchanged. This indicates that the 
covariate effect part of the hazard, namely \exp(\beta Z), is consistent 
if we can assume the proportionality of the hazards. 



The next model we consider relaxes the proportional 
hazards assumption. This is a useful model for checking 
proportionality of the linear type. Returning to the original 
derivations of the model we can express the hazard rate of the next 
model as 

    \lambda(t, Z) = \lambda_0 \exp(\beta_0 Z) + \lambda_1 \exp(\beta_1 Z) t 

    L = \prod [ \lambda_0 \exp(\beta_0 Z) + \lambda_1 \exp(\beta_1 Z) t ] 
        \times \prod \exp[ -( \lambda_0 \exp(\beta_0 Z) t + \lambda_1 \exp(\beta_1 Z) t^2 / 2 ) ] 

giving the estimators 

    \lambda_0 = 0.009674    S.E. = 0.00257 
    \lambda_1 = 0.000511    S.E. = 0.00376 
    \beta_0   = 1.12        S.E. = 1.311 
    \beta_1   = 1.4848      S.E. = 0.4121 
    d.f. = 557,  ln L = -1737.97 



The value of \beta_0 is not significant. A comparison of the log 
likelihood of this model with 557 degrees of freedom and the previous 
model with 558 degrees of freedom gives a difference in -2 ln L of 3.76, 
which according to the chi-squared distribution on one degree of freedom 
is not significant at the 5% level (critical value 3.84). 
We therefore do not reject the proportionality of hazards assumption. 
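
The likelihood ratio comparison can be reproduced from the two reported 
log likelihoods. The following is a minimal sketch in Python and is not 
part of the BMDP analysis; the function name is ours, and the upper tail 
of the chi-squared distribution on one degree of freedom is obtained 
from the complementary error function. 

```python
import math

def lr_test_1df(loglik_restricted, loglik_full):
    """Likelihood ratio statistic and its chi-squared (1 d.f.) upper tail probability."""
    statistic = 2.0 * (loglik_full - loglik_restricted)
    # For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Log likelihoods of the proportional (558 d.f.) and non-proportional (557 d.f.) models.
statistic, p_value = lr_test_1df(-1739.85, -1737.97)
print(statistic, p_value)
```

The statistic is 3.76 and the tail probability is slightly above 0.05, 
in line with the conclusion that proportionality is not rejected at the 
5% level. 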



In the above models the linear structure of the hazard shapes may not 
allow an efficient estimation of the effects. This point is expressed 
more vividly when we deal with the Weibull models of the next section. 
In the discussion of the exponential model estimator it will be made 
clear that the actual value of \lambda_0 is arbitrary in so far as the 
comparison of the \beta's for different subgroups is concerned. 

So far in the study of the application of polynomial hazard rates, 
the above linear hazard rate has been the most appropriate. We will 
now estimate some of the subgroup covariate effects by this model and 
observe their contribution to a proper explanation of patient survival 
variation. The prognostic categories that are of particular interest 
now, and which will be discussed in more detail later, are menopausal 
status, initial size of the tumour and node histology status. Later 
in chapter 6 we will define the indicators in greater detail. Here 
we will only use them for the purpose of illustration. 

We fit covariate effect models to the data for each of the 
above main effects in the presence of the treatment effect. Consistently 
we note that there is a reduction in the treatment effect estimator 
\beta, and a comparison of the covariate functions does not show any 
important difference in the treatment effect estimators. This indicates 
that the treatment effect is stable for the different prognostic groups. 

Model with treatment and node, 

    \beta_treatment  = 1.4521    S.E. = 0.4072 
    \beta_node       = 1.2182    S.E. = 0.6410 

Model with treatment and size, 

    \beta_treatment  = 1.4486    S.E. = 0.4513 
    \beta_size       =           S.E. = 0.6834 

Model with treatment and menopausal status, where we consider 
post menopausal and menopausal group as one category and 
pre-menopausal as another, 

    \beta_treatment  = 1.2730    S.E. = 0.4481 
    \beta_menopausal = 1.3051    S.E. = 0.5913 



3.5 Exponential distribution for censored survival data with covariates. 

The exponential distribution has been used extensively as a basis 
for study of survival distributions. The simplicity of this model has 
been the main reason for its common usage. However, at times it has 
been used in situations where the assumption of constant hazards has been 
violated. The model is simple to estimate and has only one parameter 
for defining the failure rate which is not dependent on time. Thus in 
this model the risk of death is independent of time. 



In an early demonstration of the exponential survival distribution 
Boag (1949) applied the distribution to the survival of cancer patients. 
Davis (1952) examined the distribution in relation to the field of 
reliability and applied the method to 26 mechanical survival situations. 
Several authors, Halperin (1952) and Epstein and Sobel (1953, 1954), 
investigated the maximum likelihood estimation of the one parameter case, 
\lambda, for censored data. Feigl and Zelen (1965) investigated the 
problem of estimation of constant linear hazards \lambda with covariates. 
The exponential distribution at times has been termed the "memoryless 
distribution" since the hazard rate is not a function of time. 

The median of the distribution is \ln(2) / \lambda, the mean is 
1 / \lambda and the variance is 1 / \lambda^2, where \lambda is 
interpreted as the force of mortality. The larger the value of \lambda, 
the shorter is the mean life. The estimation for the uncensored case of 
the distribution is relatively simple. Using maximum likelihood 
estimation there is a closed form solution for the mean and variance. 

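For a situation of random censorship, defined in the introduction 
to be the most common censoring in trials, we have the likelihood 

    L = \prod_{deaths (i)} f(t_i) \prod_{censored (i)} S(t_i) 
      = \prod_{deaths (i)} \lambda e^{-\lambda t_i} \prod_{censored (i)} e^{-\lambda t_i} 
      = \prod_{i=1}^{n} ( \lambda e^{-\lambda t_i} )^{\delta_i} ( e^{-\lambda t_i} )^{1 - \delta_i} 
      = \prod_{i=1}^{n} \lambda^{\delta_i} e^{-\lambda t_i} 
      = \lambda^{n^*} \exp( -\lambda \sum_{i=1}^{n} t_i ) 

where n^* = number of uncensored cases or deaths, giving 

    \ln L = n^* \ln(\lambda) - \lambda \sum_{i=1}^{n} t_i 

    \partial \ln L / \partial\lambda = n^* / \lambda - \sum_{i=1}^{n} t_i = 0 

so that the maximum likelihood estimator of \lambda has value 

    \hat\lambda = n^* / \sum_{i=1}^{n} t_i 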

The second derivative of the log likelihood with respect to \lambda 
yields 

    \partial^2 \ln L / \partial\lambda^2 = -n^* / \lambda^2 

so that 

    ( \hat\lambda - \lambda ) / \sqrt{ \lambda^2 / n^* } 

is approximately normally distributed as N(0,1), using the asymptotic 
normality results on likelihoods. Further a transformation by the delta 
method gives 

    \hat\lambda \sim N( \lambda, \lambda^2 / n^* ) 
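
The estimator and its large sample standard error are simple enough to 
compute directly. The following is a minimal sketch in Python; the 
function name and the four cases in the example are hypothetical, and 
the standard error replaces \lambda by its estimate in the variance 
\lambda^2 / n^*. 

```python
import math

def exponential_mle(times, events):
    """Censored-data MLE of the constant hazard: deaths divided by total time at risk."""
    n_star = sum(events)                 # number of uncensored cases (deaths)
    total_time = sum(times)              # failure and censoring times both count
    rate = n_star / total_time
    standard_error = rate / math.sqrt(n_star)
    return rate, standard_error

# Hypothetical data: times in months, 1 = death, 0 = censored.
rate, se = exponential_mle([2.0, 4.0, 6.0, 8.0], [1, 0, 1, 1])
print(rate, se, math.log(2.0) / rate)    # hazard, its S.E. and the implied median
```

Censored cases contribute to the denominator only, which is how the 
censoring enters the estimate. 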

The estimator for the exponential distribution with only one parameter 
\lambda is rather simple to obtain. The next stage of the development 
of the exponential distribution is to use covariates. The use of 
covariates with an exponential hazard rate may be developed with 
maximum likelihood estimation and the Newton-Raphson procedure. The 
reason for the use of the Newton-Raphson procedure is that we often do 
not have a closed form solution for the estimator. The interpretation 
of the results is also straightforward if the assumption of time 
independent hazards holds. In the above we obtained a method for the 
estimation of the hazard rate \lambda. In the situation of analysis 
with covariates and use of maximum likelihood estimation, \lambda in 
fact is not needed and it is possible to derive an inference for the 
covariates by setting \lambda, the base hazard rate value, to 1. In the 
section on the Weibull we will derive functions for a maximum 
likelihood estimation of covariates and the Weibull shape parameter. 

The exponential with covariates is a special case of that 
procedure and will be discussed in more detail there. 



3.6 The Weibull distribution. 

The Weibull distribution was originally used by a Swedish 
physicist, Waloddi Weibull, who was interested in measuring the breaking 
strength of materials. The main reason for the initial interest in the 
Weibull distribution was that, unlike the exponential, it was able to 
fit the data even when the breaking rate was not constant. Later 
A.C. Cohen produced maximum likelihood estimators for 
uncensored and censored cases. 

The Weibull distribution is an extension of the exponential 
distribution. In the graphical presentation of the first section of 
this chapter, it was shown how, with the shape parameter set to one, the 
Weibull distribution reduces to the exponential. Apart from the 
situation with the shape parameter set to one, the hazard function in 
the Weibull is time dependent, and thus the rate of failure changes 
with the passage of time. 

For the distributional definitions of the Weibull in section 3.1 
we have: the median of the Weibull is given by 

    ( \ln(2) / \mu )^{1/\nu}, 

the mean is given by 

    \Gamma(1 + 1/\nu) / \mu^{1/\nu}, 

and the variance is 

    [ \Gamma(1 + 2/\nu) - \Gamma(1 + 1/\nu)^2 ] / \mu^{2/\nu}. 



Here \mu is related to the hazard rate. The actual interpretation 
of \nu for values greater than one and less than one is the same as 
in those applications in which the rate of death changes because of the 
underlying biological process. The estimation procedure is more complex 
than in the exponential case. Neither in uncensored nor in censored 
models does there exist a closed form maximum likelihood estimator. We 
now proceed with the derivation of the maximum likelihood estimator for 
a Weibull model with covariates. 
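
The summary measures of the Weibull given earlier in this section can be 
evaluated with the gamma function. The following is a minimal sketch in 
Python, assuming the parameterisation S(t) = exp(-\mu t^\nu) used in 
this chapter; the function name is ours. 

```python
import math

def weibull_summary(mu, nu):
    """Median, mean and variance of the Weibull with S(t) = exp(-mu * t**nu)."""
    median = (math.log(2.0) / mu) ** (1.0 / nu)
    mean = math.gamma(1.0 + 1.0 / nu) / mu ** (1.0 / nu)
    variance = (math.gamma(1.0 + 2.0 / nu) - math.gamma(1.0 + 1.0 / nu) ** 2) / mu ** (2.0 / nu)
    return median, mean, variance

# With the shape parameter set to one the exponential results are recovered:
# median ln(2)/mu, mean 1/mu and variance 1/mu**2.
print(weibull_summary(mu=0.5, nu=1.0))
```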



In the last section concomitant information was introduced into 
the likelihood function for Taulbee's general model. According to the 
hazard functions, the subgroups affect the rate of events in terms of 
intensity through the relationship \exp(\beta Z). However, the 
covariate part of the model does not affect the shape of the base line 
hazard rates. 

The effect of the covariates on the Weibull hazard is 
represented by 

    \lambda(t, Z) = \nu \mu t^{\nu - 1} e^{Z\beta} 

The latter part \exp(Z\beta) refers to covariates and is independent 
of time. This is basically a similar assumption to the one used in 
the last section to derive some of the results for concomitant 
information. 

According to the relations in section 3.1 and the above 
hazard rates, the density function and survival function for the 
Weibull with covariates are 

    f(t, Z) = \nu \mu t^{\nu - 1} e^{Z\beta} \exp[ -\mu t^{\nu} e^{Z\beta} ] 

    S(t, Z) = \exp[ -\mu t^{\nu} e^{Z\beta} ] 

All the above expressions for the hazard rate, density function and the 
survival function can be considered as a generalisation of the 
exponential distribution with concomitant variables, which is recovered 
simply by setting \nu to one. These results can serve as a general 
purpose model for the different forms of the exponential and Weibull 
models, when the emphasis is on the estimation of the covariate effects. 
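
The relations between the three functions can be verified numerically. 
The following is a minimal sketch in Python for a single covariate; the 
function names and the parameter values in the example are hypothetical. 
It checks that the density is the hazard times the survival function, 
that it equals minus the derivative of the survival function, and that 
the hazard is free of t when the shape parameter is one. 

```python
import math

def weibull_hazard(t, z, beta, mu, nu):
    """lambda(t, Z) = nu * mu * t**(nu - 1) * exp(Z * beta)."""
    return nu * mu * t ** (nu - 1.0) * math.exp(beta * z)

def weibull_survival(t, z, beta, mu, nu):
    """S(t, Z) = exp(-mu * t**nu * exp(Z * beta))."""
    return math.exp(-mu * t ** nu * math.exp(beta * z))

def weibull_density(t, z, beta, mu, nu):
    """f(t, Z) = lambda(t, Z) * S(t, Z)."""
    return weibull_hazard(t, z, beta, mu, nu) * weibull_survival(t, z, beta, mu, nu)

# Hypothetical values, with a small step for the numerical derivative of S.
t, z, beta, mu, nu, eps = 10.0, 1, 0.8, 0.02, 1.4, 1e-5
slope = (weibull_survival(t + eps, z, beta, mu, nu)
         - weibull_survival(t - eps, z, beta, mu, nu)) / (2.0 * eps)
print(weibull_density(t, z, beta, mu, nu), -slope)

# With nu = 1 the hazard is mu * exp(Z * beta) at every t (the exponential case).
print(weibull_hazard(1.0, z, beta, mu, 1.0), weibull_hazard(50.0, z, beta, mu, 1.0))
```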

Using the formulations from previous sections, the likelihood function 
is 

    L = \prod_{i=1}^{n} [ \mu \nu t_i^{\nu - 1} \exp(\beta Z_i) ]^{\delta_i} 
                        \exp[ -\mu t_i^{\nu} e^{\beta Z_i} ] 

where as before \delta_i = 1 for death and \delta_i = 0 for censorings, 
n is the total sample and n^* is the number of deaths, so that 

    L = \mu^{n^*} \prod_{i=1}^{n} [ \nu t_i^{\nu - 1} \exp(\beta Z_i) ]^{\delta_i} 
                                  \exp[ ( -t_i^{\nu} e^{\beta Z_i} ) \mu ] 

The value of \mu is independent of the time t and covariate effect 
\exp(\beta Z_i). Thus it is a scaling measure and \mu does not have an 
effect on the comparative values of \beta and \nu. In the following 
expression for the log likelihood, the terms involving \mu are omitted, 
which amounts to setting \mu to 1. 

    \ln L = \sum_{i=1}^{n} \{ \delta_i \ln[ \nu t_i^{\nu - 1} \exp(\beta Z_i) ] - t_i^{\nu} e^{\beta Z_i} \} 

          = \sum_{i=1}^{n} \{ \delta_i ( \ln \nu + (\nu - 1) \ln t_i + \beta Z_i ) - t_i^{\nu} e^{\beta Z_i} \} 

          = n^* \ln \nu + \sum_{i=1}^{n} \{ \delta_i [ (\nu - 1) \ln t_i + \beta Z_i ] - t_i^{\nu} e^{\beta Z_i} \} 

          = n^* \ln \nu + \sum_{i=1}^{n} \{ \delta_i ( \nu \ln t_i + \beta Z_i ) - t_i^{\nu} e^{\beta Z_i} \} 
            - \sum_{i=1}^{n} \delta_i \ln t_i 

The last part, \sum \delta_i \ln t_i, is not dependent on the parameters 
\nu and \beta. Thus in terms of the proportionality of the likelihood 
we let 

    R_i = t_i^{\nu} e^{\beta Z_i}   and   \ln R_i = \nu \ln t_i + \beta Z_i 

giving 

    \ln L = n^* \ln \nu + \sum_{i=1}^{n} [ \delta_i \ln(R_i) - R_i ] 

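Now we obtain the derivatives of the logarithm of the likelihood. 
Setting these derivatives equal to zero gives the maximum likelihood 
estimators. 

    \partial \ln L / \partial\beta_j = \sum_{i=1}^{n} ( \delta_i Z_{ij} - t_i^{\nu} e^{\beta Z_i} Z_{ij} ) 
                                     = \sum_{i=1}^{n} ( \delta_i - R_i ) Z_{ij} = 0 

    \partial \ln L / \partial\nu = n^* / \nu + \sum_{i=1}^{n} ( \delta_i \ln t_i - t_i^{\nu} e^{\beta Z_i} \ln t_i ) 
                                 = n^* / \nu + \sum_{i=1}^{n} ( \delta_i - R_i ) \ln t_i = 0 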

In the procedure for the estimation of \beta and \nu we also need to 
know the second derivatives of the logarithm of the likelihood. These 
values are used in the maximising procedure of the Newton-Raphson method 
as well as in deriving the information matrix to obtain the variance 
covariance estimators. 

    \partial^2 \ln L / \partial\beta_j \partial\beta_k 
        = -\sum_{i=1}^{n} t_i^{\nu} e^{\beta Z_i} Z_{ij} Z_{ik} 
        = -\sum_{i=1}^{n} R_i Z_{ij} Z_{ik} 

    \partial^2 \ln L / \partial\beta_j \partial\nu 
        = -\sum_{i=1}^{n} t_i^{\nu} e^{\beta Z_i} Z_{ij} \ln t_i 
        = -\sum_{i=1}^{n} R_i Z_{ij} \ln t_i 

    \partial^2 \ln L / \partial\nu^2 
        = -n^* / \nu^2 - \sum_{i=1}^{n} t_i^{\nu} e^{\beta Z_i} ( \ln t_i )^2 
        = -[ n^* / \nu^2 + \sum_{i=1}^{n} R_i ( \ln t_i )^2 ] 
The above functions are thus the necessary functions that may be 
used in conjunction with a standard Newton-Raphson maximisation. 
S.D. Silvey (1975) describes such a procedure. 
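
To show how these functions fit together, the following is a minimal 
sketch in Python of a Newton-Raphson maximisation for a single covariate, 
with step halving added as a safeguard. It is not the BMDP procedure used 
for the trial data; the function names are ours, and the data are 
simulated under assumed values of \beta and \nu with \mu = 1, purely for 
illustration. 

```python
import math
import random

def simulate(n, beta, nu, seed=1):
    """Hypothetical data: Weibull times with mu = 1, one binary covariate, uniform censoring."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        z = i % 2
        e = -math.log(1.0 - rng.random())                  # unit exponential variate
        t_fail = (e / math.exp(beta * z)) ** (1.0 / nu)    # solves t**nu * exp(beta*z) = e
        t_cens = 0.05 + 3.0 * rng.random()
        t = max(min(t_fail, t_cens), 1e-9)                 # guard against log(0)
        data.append((t, 1 if t_fail <= t_cens else 0, z))
    return data

def log_likelihood(beta, nu, data):
    """ln L = n* ln(nu) + sum_i [delta_i * ln(R_i) - R_i], with R_i = t_i**nu * exp(beta * z_i)."""
    total = 0.0
    for t, delta, z in data:
        log_r = nu * math.log(t) + beta * z
        total += delta * (math.log(nu) + log_r) - math.exp(log_r)
    return total

def score_and_hessian(beta, nu, data):
    """First and second derivatives of ln L with respect to beta and nu."""
    g_b = g_v = h_bb = h_bv = h_vv = 0.0
    for t, delta, z in data:
        log_t = math.log(t)
        r = math.exp(nu * log_t + beta * z)
        g_b += (delta - r) * z
        g_v += delta / nu + (delta - r) * log_t
        h_bb -= r * z * z
        h_bv -= r * z * log_t
        h_vv -= delta / nu ** 2 + r * log_t ** 2
    return (g_b, g_v), (h_bb, h_bv, h_vv)

def newton_raphson(data, beta=0.0, nu=1.0, tol=1e-8, max_iter=50):
    """Returns the estimates of beta and nu and their standard errors."""
    for _ in range(max_iter):
        (g_b, g_v), (h_bb, h_bv, h_vv) = score_and_hessian(beta, nu, data)
        det = h_bb * h_vv - h_bv ** 2
        d_b = -(h_vv * g_b - h_bv * g_v) / det             # Newton step: -H^{-1} g
        d_v = -(h_bb * g_v - h_bv * g_b) / det
        step, current = 1.0, log_likelihood(beta, nu, data)
        # Halve the step while nu would leave (0, inf) or the likelihood would fall.
        while step > 1e-10 and (nu + step * d_v <= 0.0 or
                log_likelihood(beta + step * d_b, nu + step * d_v, data) < current - 1e-9):
            step /= 2.0
        beta, nu = beta + step * d_b, nu + step * d_v
        if max(abs(step * d_b), abs(step * d_v)) < tol:
            break
    (_, _), (h_bb, h_bv, h_vv) = score_and_hessian(beta, nu, data)
    det = h_bb * h_vv - h_bv ** 2
    # Variances are the diagonal of the inverse of the observed information, -H.
    return beta, nu, math.sqrt(-h_vv / det), math.sqrt(-h_bb / det)

print(newton_raphson(simulate(2000, beta=0.7, nu=1.5)))
```

Fixing nu at one and iterating on beta alone gives the exponential model 
with covariates as a special case of the same scheme. 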



3.7. Interpretation of the models with use of the old Edinburgh 
Trial Data. 



In this section we perform an analysis of the old Edinburgh 
trial data with the parametric methods. The general purpose is 
to give a comparative illustration of the parametric and non-parametric 
methods as discussed in the last section. First we perform an 
exponential and then a Weibull model analysis with only one regression 
coefficient. 

The first covariate we test is the treatment effect, that 
is a comparison of survival times for simple surgery and 
radiotherapy against radical surgery. With the shape parameter value 
fixed at one we are in fact using an exponential model. Further, a 
departure of the shape parameter from one indicates a Weibull 
model. 

    Shape parameter       \beta_option    S.E.     d.f. 
    set to one               1.4821       .4021    559 
    estimated = 1.36         1.6321       .4043    558 

    (S.E. of the shape parameter = 1.18) 

We continue the estimation procedure with the inclusion of another 
covariate, the menopausal status. Once again we consider post- 
menopausal and menopausal as one category and the pre-menopausal 
as a separate category. 

    Shape parameter     \beta_option   S.E.     \beta_meno   S.E.     d.f. 
    set to one             1.3819      .4921      1.3722     .3821    558 
    estimated = 1.53       1.3521      .4185      1.3986     .2581    557 

In both of the above models we note that option and menopausal 
status play an important role in describing the survival rate of the 
patients. Then we consider the addition of tumour size. 

Shape parameter    β option  S.E.   β meno  S.E.   β size  S.E.   d.f.
set to one         1.4181    .4931  1.3843  .3840  1.1927  .3192  557
estimated = 1.54   1.4961    .4166  1.3506  .2931  1.1159  .2901  556

Now we add a term for node status to the above models.

Shape parameter    β option  S.E.   β meno  S.E.   β size  S.E.   β node  S.E.   d.f.
set to one         1.423     .4930  1.4134  .3872  1.3741  .3793  1.4721  .5128  556
estimated = 1.54   1.478     .4381  1.3902  .2881  1.1462  .3121  1.531   .6321  555

The above models show that the survival differences of the patients
can be attributed to the above covariate indicators. So far we have
not considered significance levels of the different estimators for
the parametric methods. In Chapter 6 we put more emphasis on the
analysis and interpretation of the data rather than a comparison of
the analytical methods. In summary the above models indicate that
for the above covariates there is very little to choose between the
exponential and the Weibull. The results of the Taulbee family for
the 2nd term also show very similar results which, because of their
close similarity, are not detailed here.

Now we introduce the concept of interaction and its use in the
framework of a parametric model. In the second stage of the last
analysis with the exponential and the Weibull models, the information
from option and menopausal status played the most important role.
One advantage in the use of a regression model is that we are able to
do a formal test of interaction effects. These tests assess whether the
effect of covariates acting simultaneously is any different from the
sum of the two effects acting independently. Once again we
represent menopausal status in two categories, pre-menopausal
and menopausal + post-menopausal. The similarity of the latter two
categories of the menopausal status can be seen from the shape of
their hazards, which are in fact very similar. Further, for the
present purpose such a transformation of the menopausal status suffices.



We begin with the model presented above, which
included menopausal status and the treatment option as the only
two effects. Now we continue with a test of an interaction effect
for treatment and menopausal effects.

Shape parameter    β option  S.E.   β meno  S.E.   β size  S.E.   d.f.
set to one         1.3839    .4938  1.2121  .3109  .2127   .3782  557
estimated = 1.52   1.3210    .4179  1.2382  .2052  .6171   .6312  556

This result indicates that all the necessary information may be
contained within the two main effects. We therefore conclude
that the radical treatment group performs better in terms of survival
time. The effect of treatment is consistently the same for the
various categories of the prognostic indicators, size, node and
menopausal status. The behaviour of the various categories of
the indicators is as may be expected. That is, the smaller tumours,
younger patients and the node negative tumours are the good prognosis
groups, and the older patients, larger tumours and node positive
tumours are the groups with the worst survival times.

None of the main effects of the prognostic values show 
an important interaction with treatment effects. That is all sub- 
group variability of the survival times can be described in an 
additive manner. 



The final model of the Weibull and the exponential
distribution with all three covariates and treatment effect included
shows very similar estimators of the prognostic main effects in
comparison to models with treatment and one covariate effect included,
thus once again suggesting that prognostic values are consistently
the same given the present framework of the Weibull model.

3.8. Families of distribution with covariate effects.

The Taulbee or Turner family of distributions can provide
a flexible set of distributions for use in failure time analysis.
When we deal with covariates there is another approach to classifying
distributions, according to a combination of hazard rates and
covariate effects. The most commonly used method is to assume
that the population has a single underlying failure rate according
to the inherent nature of the disease. Further, any difference in
failure rates for a subgroup originates from a separately
identified covariate effect. This class is termed the proportional
hazards model, and the exponential, Weibull and polynomial
models of the previous section were based on its assumptions.
An alternative useful approach is to consider the failure rate
to be a function dependent on time and the covariate structure.
This group is known as the accelerated failure time model and we
consider it later in this section.



The group of regression models with the assumptions of
the proportional hazards are generalised as models of the form

$$\lambda(t, Z) = \lambda_0(t)\,\mathrm{Exp}(\beta Z) \qquad (3.8.1)$$

Now if we let $\lambda_0(t)$ be independent of time and set $\lambda_0(t) = \lambda$
we have an exponential distribution with covariates. Alternatively,
if we let $\lambda_0(t)$ be time dependent with a shape parameter $v$ and set

$$\lambda_0(t) = \mu v t^{v-1} \qquad (3.8.2)$$

we have a Weibull distribution. In the case of the Rayleigh family
of distributions, or a restricted Taulbee approach, we deal with
hazards of the form

$$\lambda_0(t) = \lambda_0 + \lambda_1 t \qquad (3.8.3)$$

In terms of the reduction of the data into a useful statistic,
it is clear that estimation of the $\beta$ gives the relevant information
on effects due to membership of a particular subgroup. The assumption
that must hold is that membership of a particular subgroup
does not affect the shape of $\lambda_0(t)$. That is, there exists a base
line hazard rate for the total population and any effects due to
covariates take the form of a proportional effect introduced as
$\mathrm{Exp}(\beta Z)$.
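
The proportionality assumption can be checked numerically with a short sketch. The three base line hazards below follow the forms of (3.8.1) to (3.8.3); the function names and the parameter values used in any call are assumptions made purely for illustration. For a 0/1 covariate the ratio of the two group hazards is Exp(β) at every time point, whatever the base line.

```python
import math


def exponential_baseline(lam):
    """Constant base line hazard."""
    return lambda t: lam


def weibull_baseline(mu, v):
    """Base line hazard mu * v * t**(v-1), as in (3.8.2)."""
    return lambda t: mu * v * t ** (v - 1)


def rayleigh_baseline(lam0, lam1):
    """Linear base line hazard lam0 + lam1 * t, as in (3.8.3)."""
    return lambda t: lam0 + lam1 * t


def hazard(baseline, beta, z):
    """Proportional hazards form (3.8.1): lambda_0(t) * exp(beta * z)."""
    return lambda t: baseline(t) * math.exp(beta * z)


def hazard_ratio(baseline, beta, t):
    """Hazard of the group with z = 1 relative to the group with z = 0."""
    return hazard(baseline, beta, 1)(t) / hazard(baseline, beta, 0)(t)
```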



In the maximum likelihood estimation for the
exponential and the Weibull, the actual scale of the hazard
curve plays an arbitrary role in the relative effects of the
covariates. What matters is in fact only the shape parameter of
the Weibull. The important assumption that must hold, again due to
the proportional hazards, is that regardless of the
subgroup, the total population must have the same shape parameter.
In terms of interpretation we require that the base line hazard rate
can be projected on to the subgroup rates for the entire population.

Up until now we have considered a general proportional
hazard model in which the entire population has had the same base
line hazard rate. There is an extension with which we can allow more
than one base line hazard rate. However the information contributing
to the covariate effects is inherently the same. These models are
useful in situations where a population is composed of different
strata. The information regarding membership of a particular
stratum is not testable, but information regarding some other covariate
must be estimated by allowing for the strata effects. The model
has the form

$$\lambda_j(t, Z) = \lambda_{0j}(t)\,\mathrm{Exp}(\beta Z)$$

where $j$ refers to a particular stratum. This functional form of
$\lambda_{0j}(t)$ can in fact be used for any of the parametric models. The
semi-parametric model of Cox (1972) can also be used with the above
definitions and interpretations. As an example we can allow
different hazard rates for the different strata, say young patients
and older patients. This topic in general is also related to the
time dependency of the covariates and it will be studied later in
Chapter 7.



Apart from the proportional hazards model there is another
group of regression models with a multiplicative effect of the
regression parameters, namely accelerated failure time models.
The general formulation of the model is

$$\lambda(t, Z) = \lambda_0(t e^{-Z\beta})\, e^{-Z\beta}$$

For this model, unlike the model discussed previously, the
covariates under test can have a direct effect on the base line
hazard rate that is estimated. It is important to note that both
the proportional hazards models and the accelerated failure time models
are log-linear models with additive effects of the hazard function,
the covariates and the logarithm of the time.
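
As a worked illustration (not part of the original derivation), substituting the Weibull base line of (3.8.2) into the accelerated failure time form shows that, for this particular family, the two classes of model coincide up to a rescaling of the regression parameter. The notation μ, v and β is that of this section.

```latex
\begin{aligned}
\lambda(t, Z) &= \lambda_0\!\left(t e^{-Z\beta}\right) e^{-Z\beta},
  \qquad \lambda_0(t) = \mu v t^{v-1} \\
              &= \mu v \left(t e^{-Z\beta}\right)^{v-1} e^{-Z\beta} \\
              &= \mu v t^{v-1} e^{-vZ\beta}
\end{aligned}
```

This is a proportional hazards model with regression parameter $-v\beta$ in place of $\beta$. For base line hazards outside the Weibull family the two formulations generally differ.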

These models are most useful as a generalised
model for the estimation of the regression parameters. The method
based on Turner's family of distributions, mentioned earlier, is also
useful in that it provides a way of classifying hazard rates. However,
the main advantage of the proportional hazards model compared to
Turner's family or the accelerated failure time model is that the
interpretation of events is much simpler.

3.9. Parametric, non-parametric and Cox's approach.

In the above approaches and derivations an assumption has
consistently been used in order to try to distinguish between the
different survival rates. We kept the postulate that the time to
a critical event is a random variable and that it can be explained
by a continuous function. In the last chapter, however, the methods
began with the reduction of the data into some form of
rank order. This reduction ultimately implies a loss of precision
in distinguishing the survival rates for the subgroups of the data.
The advantage of a non-parametric method based on
ranks, however, is that non-parametric tests are more robust. Extensive
comparative studies of non-parametric and parametric methods have
been done by various authors and we will deal with those in Chapter 5.
Cox's method, which offers a practical compromise between parametric
and non-parametric methods, is also considered in Chapter 5.
We will perform simulations to assess small sample properties of
Cox's method for trial data.



The analytical results of Chapter 2 and the present chapter
have been based on the analysis of the old Edinburgh trial. As may
be expected there are no qualitative differences in terms of the
conclusions of the results. However there are slight variations by
which we can reiterate the theoretical results of the earlier part of
this section on hazard rates. The importance of parametric methods
here is not that of precision alone, but rather the
ease with which parametric methods provide a conceptual
frame for classifying the distributions of survival data into families
of mathematical models. This flexibility to classify distributions
is however paid for by a greater loss in robustness. Although all
the families of distributions mentioned in this section provide flexible
frameworks within which a large number of distributions for survival
analysis are placed, it is difficult to imagine what may be done with an
estimating procedure more complex than the Taulbee approach. In fact
so far as a description of the progress of the disease matters, a plot of
the empirical hazard rates may suffice. If one is prepared to take
the position that hazard rates are mainly useful in describing the
biological progress of disease, then any robust general approach must
lie somewhere between the non-parametric and parametric methods. The
basic assumption then is that the actual rates of events are not
important and need not be parameterised, but the difference between
the failure rates in various groups must be estimated as precisely as
possible. The final result will add to the robustness of the
general method.

Cox (1972) presented a proportional hazard model in which the
data are reduced to ranks, and which thus adopts an estimating procedure
for which the rates of events are not important. The method offers a
robust and flexible approach for the analysis of survival data and
it is discussed in the next chapter.

CHAPTER 4



Cox's Proportional Hazards Model

Throughout this chapter we deal with the study
of the method proposed by Cox (1972) for survival data. We
cover topics related to the usefulness of the method as
applied to clinical trials. Some of the derivations from the original
approach, and the derivation of some of the central results, are
covered so that we may deal with the advantages and the disadvantages
of the approach.

There are a few major factors that distinguish the methods of
the previous two chapters from the proportional hazards approach. The
latter method is more efficient than the non-parametric methods that were
discussed in Chapter 2. The method in fact allows comparison of
the covariate effects to be made without making unduly restrictive
assumptions. In relation to the completely non-parametric methods, however,
it is more suitable for providing a useful conceptual model for
considering and testing the relationships of the effects efficiently, in
particular when several covariates are tested. Further, it considers the
relative effects of covariates as the relevant information for analysis
and thus is more robust than the parametric methods, where the
requirement is a closer approximation to the survival rates for the various groups.

There are certain requirements that must be satisfied in the use
of the method. One is related to the proportional hazards
assumption in a non-parametric setting. The other requirement is on the
type of information that is available on each case $i$. We must have
a set of covariates $Z_i(t)$, so that predictions on the survival times of
the population may be made. The time we consider, from the definitions
of the previous chapter, can be either time to the terminating event,
e.g. death, or to the follow-up event, e.g. censoring. However, the
functional form of $Z_i(t)$ refers to the development of the covariate process
on the survival time scale.

In the above discussion we mentioned the core topics
of this section; later we will consider these topics in greater detail.

4.1 Development of survival functions.

We use a similar methodology to that used for the parametric
methods. $S(t)$ is the survival function, $f(t)$ is the density function
and the hazard function is given by $\lambda(t)$, such that if $T$ is a random
variable representing failure time, then for sufficiently short periods
of time $h$, the hazard rate at time $t$ is given by

$$\lambda(t) = \lim_{h \to 0^{+}} \frac{1}{h}\Pr(t \le T < t + h \mid T \ge t) \qquad (4.1.1)$$

What the above basically implies is that the rate of failure is the
conditional probability of an event at time $t$, given that the individual
has survived until a time immediately previous to it. For a continuous
distribution we mentioned a similar definition in the introduction.
However we now have a situation in which time $T$ has a discrete
distribution and observed times have values $t_1 < t_2 < \dots < t_k$.
It follows that,

$$f(t) = \Pr(T = t), \qquad S(t) = \prod_{j \mid t_j < t} \left[1 - \lambda(t_j)\right] \qquad \text{and} \qquad \lambda(t) = \Pr(T = t \mid T \ge t) \qquad (4.1.2)$$



The above formulation of (4.1.2) is an extension of the previous
definition (4.1.1), with the difference that $T$ is now discrete.
The theoretical distinction between discrete and continuous forms of
the hazard rate does not prohibit extension of the proportional hazards
to a discrete analogue. For the above distribution with a covariate
set $Z$ a corresponding survival function is

$$S(t; Z) = \left[S_0(t)\right]^{\mathrm{Exp}(Z\beta)}$$

where $S_0(t)$ represents a base line survival rate at $Z = 0$ and has a
corresponding base line hazard rate given by $\lambda_0(t)$. We will return
to the above formulation in section 4.4 for the construction of the
likelihood.

In the context of the general proportional hazards model we can
express the hazard rates as

$$\lambda(t, Z) = \lambda_0(t)\, r(Z, \beta) \qquad (4.1.3)$$

where all the relevant information regarding the difference between survival
rates is carried by the relative rates of failure in the $r(Z, \beta)$
function. Cox uses an exponential decomposition of $r(Z, \beta)$, giving

$$\lambda(t, Z) = \lambda_0(t)\,\mathrm{Exp}(\beta Z) \qquad (4.1.4)$$

For the base line hazard rate of the proportional hazard model Cox
uses a discrete form of $\lambda_0(t)$, based on ranks of times. The aim is
that by use of this form of base line hazard rate robustness may be
introduced into the model, for the estimation of what remains relevant,
i.e. the relative risks. The base line hazard rate is a form of
nuisance parameter and we will deal with nuisance parameters later.
As to the interpretation of $\lambda_0(t)$ within the statistical theory, in
Chapter 2 we showed a maximum likelihood derivation of the Kaplan
and Meier estimate, and the two are essentially the same. The values
of the $\beta$ estimators are similar in interpretation to the case of the
parametric models.



The derivation of the Kaplan and Meier estimates as maximum
likelihood estimators justifies the use of a discrete distribution and a
parametric decomposition. In this form of the proportional hazards
model, the discrete form of the base line hazard variability is removed
and the data is transformed to a base line of Kaplan and Meier
estimates. Cox (1972) discusses both discrete and continuous failure
time data and shows a unification in the approach by which both the
discrete and the continuous cases can be accommodated in essentially
the same way. The term $\lambda_0(t)$ is a transformation of survival times
into the rank based product limit estimates for the hazard rates.
Thus we have ranks $t_{(1)} < \dots < t_{(k)}$,

$$\lambda_0(t) = \sum_{i=1}^{k} \frac{m_{(i)}}{r_{(i)}}\,\delta(t - t_{(i)}), \qquad \frac{m_{(i)}}{r_{(i)}} = \frac{\text{no. of deaths at } t_{(i)}}{\text{no. at risk at } t_{(i)}}$$

$$S(t) = \prod_{t_{(i)} < t} \left[1 - \lambda_0(t_{(i)})\right] = \prod_{t_{(i)} < t} \left[1 - \frac{m_{(i)}}{r_{(i)}}\right]$$

The function $\delta(t - t_{(i)})$ represents a Dirac delta taking values 0 or 1.
It is 1 in case of a failure at $t_{(i)}$ and 0 elsewhere. $m_{(i)}$ refers to
the number of events at rank $(i)$ and $r_{(i)}$ refers to the number at risk.
This is a generalisation of the Kaplan and Meier estimates. In order
to avoid problems associated with censoring times tied with failure times
we adopt the convention of letting censoring occur just after the failure.
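
The product limit calculation can be sketched in a few lines. The following is a minimal illustration only: the function name and the layout of the returned values are assumptions made for the example. Subjects censored at a failure time are counted as at risk at that time, which implements the convention that censoring occurs just after the failure.

```python
from itertools import groupby


def product_limit(times, events):
    """Return (t, m, r, S) at each distinct failure time t.

    m is the number of failures at t, r the number at risk at t and
    S the product limit estimate immediately after t.
    events holds 1 for a failure and 0 for a censoring.
    """
    data = sorted(zip(times, events), key=lambda te: te[0])
    at_risk = len(data)
    surv = 1.0
    out = []
    for t, group in groupby(data, key=lambda te: te[0]):
        group = list(group)
        m = sum(e for _, e in group)
        if m > 0:
            # Everyone with observed time >= t, including those censored
            # at t, is still at risk when the failures at t occur.
            surv *= 1.0 - m / at_risk
            out.append((t, m, at_risk, surv))
        at_risk -= len(group)
    return out
```

For example, with times 1, 2, 2, 3, 4 and event indicators 1, 1, 0, 0, 1, the subject censored at time 2 is still counted among the four at risk at that time.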

Now, regarding the relative risk part of equation (4.1.3),
we intend to categorise our population according to a set of measurements
available on the patients. The measurements in this context are
referred to as covariates. The $\beta$'s are values that must be estimated and
they provide information on the effects of covariates. Once again,
similar to the definitions of Chapter 3, we refer to the $\beta$'s as the regression
parameters.



In equation (4.1.3) we separated the effects into $r(Z, \beta)$ and
a time dependent function $\lambda_0(t)$. Depending on the form of the
covariate effects it is possible that the explanatory variable $Z_i$ is
also a function of time. That is, the contribution of the covariate is
allowed to be a random variable that changes with time, so a formulation
such as $Z_1(t), \dots, Z_s(t)$ may be more appropriate. If our
population consists of $n$ patients and $s$ covariate measurements, then an
$s \times n$ matrix set out as follows can define all auxiliary information
(information apart from deaths and censorings).

$$\begin{matrix} z_{11}(t) & z_{21}(t) & \dots & z_{s1}(t) \\ z_{12}(t) & z_{22}(t) & \dots & z_{s2}(t) \\ \vdots & \vdots & & \vdots \\ z_{1n}(t) & z_{2n}(t) & \dots & z_{sn}(t) \end{matrix} \qquad (4.1.6)$$

Thus a general form of the Cox (1972) proportional hazards model with
regression parameters is given by

$$\lambda(t, Z(t)) = \lambda_0(t)\,\mathrm{Exp}(\beta Z(t)) \qquad (4.1.7)$$

Before proceeding with the discussion of the various assumptions
necessary for the estimation of the regression parameters, $\beta$, we discuss
the role of $\lambda_0(t)$ in the framework of the model.

4.2 Role of the Nuisance functions and the relative risks. 

Meaningful isolation of relevant information is the major 
intention in much of statistical work. This intention can be achieved 
at times only by estimation of parameters that specify a distribution. 
In some complex processes we require a reduction of the data in a more 
elaborate manner. 

Figures (4.2.1) and (4.2.2) present the survival rates and
the disease free interval for a group of 337 patients who were entered
into a randomised adjuvant chemotherapy trial in the South East of
Scotland for four years from 1.4.74. Our purpose in presenting these
results is to consider the relevance of the proportional hazards to such
studies. A comparison of the rates of failure by survival curves A and B is
sufficient in giving relative rates of failure. However a more robust
and thus a less restrictive estimating procedure may be achieved by
realising that the relevant information in terms of the difference between
survival rates of A and B in either figure is in fact contained within
the shaded region C. Thus the information relating to the shape of A or
B at times need not play a significant role in the interpretation of the
data.

Parameters relating to the actual shapes of curves A and B are
referred to as nuisance parameters. In model (4.1.3), $\lambda_0(t)$ is a
function that must be estimated in relation to the effects of covariates $Z_i$,
and its estimator is a nuisance parameter. In fact a set of sufficient
statistics for the estimation of parameters generating the shaded region
is composed of the $\beta$ parameters of the $Z$ covariates. Thus in the case
of a clinical trial we perform a trial $L$, and obtain a data set $(L, Z)$
where for each element of $Z$ a measurement has been made to assess its
value for a particular patient $i$.

In the most elementary form of applying the probability 
theory, we have 3 general abstractions. A sample space X which is 
the set that conclusions refer to, a subset of X which is the total data 
set, a reduction of X, given by the model M, and a further abstraction 
P, which is a probability measure on model M and represents a form of 
variability of the data from the model. 

In the context of models of survival time $P$ is in fact composed
of a subset $P_Z$ for each particular covariate and a $P_H$ for the hazard
rate. In an ideal situation we would like to have a one to one mapping
$Z \to P_Z$ for each covariate. In the case of a trial with a model of the
form (4.1.3), the above restriction would require specifying the
distribution of the survival function and the covariates for each form of risk
that depends on the covariate set. However, for reasons of generality
and robustness a reduction is made. If in a trial the actual form
of the hazard rate for each competing risk is not of interest, then the
two subsets of $P$, namely $P_Z$ and $P_H$, can be defined as follows. $P_Z$
relates to a probability measure in terms of relative risks of covariates
and $P_H$ to probability measures for the hazard rates. The nuisance
function in proportional hazards models is related to $P_H$.

There are certain points that must be considered in relation 
to the nuisance parameters. The actual trial and the way it is planned 
plays a role for maximising the support we obtain from the data. Since 
the relevant information is related to the covariates rather than time, 
we can maximise this form of information by the usual procedure of 
randomisation and possible stratification of prognostic indicators and 
treatments. The maximisation of support in no way needs to be related 
to a time factor. The model achieves its robustness by transforming 
the time scale into a rank order, and thus the new scale is sufficient 
to measure the amount of support the data gives to various values of $\beta$.
The maximum likelihood approach provides a setting for optimising 
these values and hence obtain the various required estimates. This 
data reduction in Cox's approach as described so far requires the 
proportional hazards assumption. That is we expect the hazards 
for the subgroups to be multiples of a base line hazard. In the above 
paragraphs we discussed the issues related to nuisance parameters 
and some of the necessary assumptions that are related to it. The 
expansion of the above can include time dependency of hazards, 
multiple competing risks, censoring and stratification of the data, 
when this type of model is used for analysis. Later in this chapter 
and in Chapter 7 we will return to these points. 
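
The robustness gained from the rank transformation can be seen in a small sketch: any strictly increasing transformation of the time scale leaves the rank order, and therefore any analysis based only on ranks, unchanged. The helper name below is an assumption made for the example, and ties are assumed absent.

```python
import math


def ranks(times):
    """Rank of each time (1 = earliest); assumes there are no ties."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    r = [0] * len(times)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r
```

Applying `ranks` to a set of survival times, to their logarithms, or to their squares returns the same list, so the support the data give to values of β through the ranks does not depend on how time is measured.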



Continuing for the moment with the proportional hazard
situation, the relative risk part of the model provides the necessary
framework for extraction of relevant information. Here the $\beta$'s must
be estimated and they give a representation of the dependence of the
distribution of survival time $T$ on the subgroups. The covariate set
$Z$ provides the necessary information on treatments, categories of
prognostic indicators or some other measurements that are considered
to be relevant at the beginning of the study. $\beta$ is a $1 \times s$ vector
and it follows that one element of $\beta$ has to be estimated for each $Z_j$
($j = 1, \dots, s$). The actual functional form of $r(Z, \beta)$ is often taken
to be of the type $\mathrm{Exp}(\beta Z)$. Cox adopts the above exponential
decomposition but also allows the $Z$'s to be time dependent, in the form
$\mathrm{Exp}(\beta Z(t))$. By allowing the time dependent form of $Z(t)$ to
operate we are in fact allowing the data to generate a model with
non-proportional hazards. This inclusion of time dependent
covariates allows us to assess and test whether the effect of certain prognostic
indicators diminishes over time. It is difficult to separate topics
such as time dependency, censoring with dependent effects, and
competing risks when some of the withdrawals are in fact events due
to other causes.

Thomas (1980) concentrates on the functional form of
$r(Z, \beta)$ and produces a set of relative risk functions, e.g.
$1 + Z\beta$, $1 + \mathrm{Exp}(\beta) Z$, etc.
Gore (1981) considers an exponential decomposition of the form

$$\mathrm{Exp}\left(\beta_1 e^{-P_1 t} Z_1 + \beta_2 e^{-P_2 t} Z_2\right)$$

Kalbfleisch and Prentice (1972) suggest time dependent covariates
such as

$$\mathrm{Exp}\left(\beta_1 Z_1 + \beta_2 t Z_1\right)$$

Efron (1977) shows that $r(Z, \beta)$ can in fact be any positive function
and uses a logistic dependence of the form $\log(1 + \exp(\beta Z))$.


4.3 Limitations and assumptions of the model. 

In the formulation of likelihood functions we have used two
functions and considered these to contain the relevant information,
namely $\lambda(t)$ and $r(Z, \beta)$. A fuller likelihood may contain an
extra function, given by

$$\text{likelihood} = \lambda(t) \cdot r(Z, \beta) \cdot \gamma(\beta, Z, t) \qquad (4.3.1)$$

The form of the hypothesis of Gore (1981) can also be tested by an
expression of the form

$$\lambda(t) \cdot \mathrm{Exp}(Z_1\beta_1 + Z_2\beta_2) \cdot \mathrm{Exp}(-Z_1\beta_1 P_1 t - Z_2\beta_2 P_2 t)$$

given that

$$e^{-P_i t} = 1 - P_i t + \frac{(P_i t)^2}{2!} - \dots + \frac{(-P_i t)^n}{n!}$$

and that powers greater than or equal to two are of negligible
effect.

A slightly different analysis may use a model of the form

$$\mathrm{Exp}(Z\beta_1 + Z\beta_2 t^{*} + Z\beta_3 t^{*2})$$

where $t^{*} = (t - \bar{t})/\sigma_t$.

It is interesting to note what kind of effect is produced by weighting
the relation between $Z$ and $t$ differently in a family of transformations
such as $y^{(a)} = y^{1/a}$ and $\ln(y)$, where $y$ is a function of $t$.
For example a substitution for $\gamma(\beta, Z, t)$ in (4.3.1) may give

$$\beta Z \ln(t) \quad \text{or} \quad \beta Z\, t^{1/a}$$
We will now examine the above time dependency concepts for the
family of the Weibull distribution as described in Chapter 3. The
survival rates for a two group sample can be expressed as

$$S_i = \mathrm{Exp}\left[-\int_0^t \alpha_i u^{\gamma_i} e^{\beta_i}\,du\right] = \mathrm{Exp}\left[\frac{-\alpha_i t^{\gamma_i + 1} e^{\beta_i}}{\gamma_i + 1}\right]$$

The above is a proportional hazard expansion of the Weibull distribution;
however, for a time dependency effect we will allow $\alpha_i$ and $\gamma_i$,
as well as $\beta_i$, to depend on group membership.

$$\ln(S_i) = \frac{-\alpha_i t^{\gamma_i + 1} e^{\beta_i}}{\gamma_i + 1}$$

For an expression of the Lehmann alternatives we have

$$\frac{\ln(S_1)}{\ln(S_2)} = C\, t^{\gamma_1 - \gamma_2} e^{\beta_1 - \beta_2} \qquad \text{for } C = \frac{\alpha_1(\gamma_2 + 1)}{\alpha_2(\gamma_1 + 1)}$$

$$= C\, e^{(\beta_1 - \beta_2) + (\gamma_1 - \gamma_2)\ln(t)}$$

The relative risk expression may thus be expressed as

$$\mathrm{Exp}\left(\gamma^{*} \ln(t)\, Z_1 + \beta_1^{*} Z_1\right)$$

where $\gamma^{*}$ and $\beta_1^{*}$ are parameters that must be estimated and $Z_1$ is the
indicator of the subgroups. The $\ln(t)$ transformation will thus provide
a natural scaling for the range of the Weibull distribution.
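
The role of ln(t) can be checked numerically. The sketch below assumes two groups with hazards α_i t^(γ_i) Exp(β_i), as in the derivation above; the function names and the parameter values used in any call are illustrative assumptions. The logarithm of the ratio ln S_1 / ln S_2 is linear in ln(t), with slope γ_1 - γ_2 and intercept ln C + β_1 - β_2.

```python
import math


def log_survival(t, alpha, gamma, beta):
    """ln S(t) for the hazard alpha * t**gamma * exp(beta)."""
    return -alpha * t ** (gamma + 1) * math.exp(beta) / (gamma + 1)


def log_ratio(t, group1, group2):
    """ln[ ln S_1(t) / ln S_2(t) ] for groups given as (alpha, gamma, beta)."""
    return math.log(log_survival(t, *group1) / log_survival(t, *group2))
```

When the two shape parameters are equal the slope vanishes and the ratio no longer depends on time, which is the proportional hazards case.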



The $\gamma(\beta, Z, t)$ part of (4.3.1) in fact contains no relevant
information if we deal with a proportional hazard situation in which cases
are monitored continuously and censoring is non-informative. More
important, however, is a situation where there is a need to test the
effect of a covariate according to the time scale. An example is a
test for assessing the persistence of a prognostic indicator as an
indicator of short survival. In practice measurements on patients
are made at the beginning of the study and thus time dependency of
the covariates may be assessed by the above function. We list some
of the theoretical problems that may impose unwanted assumptions.
The list is not composed of a set of mutually exclusive topics and the
severity of the assumptions is not often significant in trials.

(1) There exists a minimum observation time and depending on this 
minimum observation time some information regarding censorings may be 
lost. Also in some studies the minimum practical observation time may 
not correspond to the minimum observation time at the analysis. For 
example, recording of death may be correct to day of death, but analysis 
is performed in weeks of survival. 

(2) One further implication of the existence of a minimum observation 
time in (1) is that the data is discrete and some ties may be present 
in the data. 

(3) Times between ranks are assumed to be non-informative. (Kalbfleisch
(1980) considers a Bayesian approach with Gamma prior distributions
between ranks.)

(4) Censoring times may be informative with respect to certain
covariates. A related situation is where a second cause of the event is
recorded. The problem is that an auxiliary cause of death, if included
among censorings, may cause censoring patterns to be informative, in the
sense that by excluding or including a particular cause of death from,
or into, the event set, inconsistent conclusions may be possible.

(5) The effect of treatment or covariates may not be consistent in time.
The $r(Z, \beta)$ relative risk assumes that a single function independent
of time is sufficient. The topics of (4) and (5) refer to trials in which
the time variable interacts with covariate effects. In such cases,
although we can assess the effect by the $r(Z, \beta(t))$ part of the model,
the conclusions are limited unless we move to a multivariate competing
risk model.
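
Points (1) and (2) can be illustrated with a short sketch: survival times recorded to the day contain no ties, but once they are analysed in completed weeks several observations share the same value. The function names and the data in the example are hypothetical.

```python
from collections import Counter


def to_weeks(days):
    """Coarsen survival times recorded in days to completed weeks."""
    return [d // 7 for d in days]


def count_ties(times):
    """Number of observations that share their time with at least one other."""
    counts = Counter(times)
    return sum(c for c in counts.values() if c > 1)
```

For the hypothetical times 3, 10, 12, 15, 20, 31 and 33 days there are no ties, whereas the corresponding weeks 0, 1, 1, 2, 2, 4 and 4 leave six of the seven observations tied.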



We illustrate the above points by the following example.
Figure (4.3.1) represents possible outcomes that may be recorded
for a case. C refers to censoring, D to an event of interest, say
death, and O to some other event, or auxiliary event.

For a situation in which all assumptions hold we require the
concentration of events to have the patterns of figures (4.3.2a), (4.3.2b)
and (4.3.2c) for the censoring times, death times and the other
auxiliary event respectively. The shaded regions refer to areas with
a higher concentration of events. The censoring and auxiliary events
in (4.3.2a) and (4.3.2c) are uniformly and equally distributed and so
do not provide useful information. If on the other hand the auxiliary
event was somehow related to loss to follow-up because of the effects
of treatments, or the event of interest is metastatic disease and the
auxiliary event is death with no previous metastatic sign, then instead
of (4.3.2c) we may obtain distributions such as (4.3.2d), (4.3.2e)
and (4.3.2f) for the auxiliary events, where the events for treatments
A and B are not uniformly distributed. We will return to this topic
after more development of the mathematics. Further, in Chapter 5
we will study the implication of some of the above assumptions in small
samples for a realistic simulation of clinical trial data.

Figure (4.3.1)

Figure (4.3.2)

4.4 The construction of the likelihood and its properties



Now we consider the methods for the estimation of the regression
parameter and the construction of the relevant likelihood equations.
Censoring is dealt with in the manner of Chapters 2 & 3. We observe
the minimum of either $T_i$, the failure time, or $C_i$, the censoring time.
The above statements can be expressed as

$(T_i < C_i) \Rightarrow$ Failure

$(T_i > C_i) \Rightarrow$ Censoring time precedes failure time.
According to the definition of proportional hazards we have

$$S_0(t) = \exp\left\{ -\int_0^t \lambda_0(u)\,du \right\} \tag{4.4.1}$$

and

$$S_Z(t) = \exp\left\{ -\int_0^t \exp(\beta Z)\,\lambda_0(u)\,du \right\}$$

giving

$$S_Z(t) = [S_0(t)]^{\exp(\beta Z)} \tag{4.4.2}$$

which is an example of a Lehmann alternative, by which a reduction
of relevant information may be made by the ratios of the two survival
distributions. A simple example is to take a single covariate case with
treatment covariate Z, set to 1 for the new treatment and set to 0 for
controls. Hence

$$S_1(t) = [S_0(t)]^{\exp(\beta)} \tag{4.4.3}$$

Thus the two survival distributions are related in a multiplicative
manner. The relation between a function of the ratios of $S_1(t)$
& $S_0(t)$ is equivalent to a constant transformation of $\beta$ and does not
involve the time factor. In other words $S_0(t)$ can be projected onto
$S_1(t)$ by a function of $\beta$. The general aim of the derivations
of this section is to estimate the values of the $\beta$'s with a
nonparametric hazard of the form $\lambda_0(t)$. Although the method
considers $\lambda_0(t)$ and functions of $\beta$, it can also be used to
generate a transformation of $\lambda_0(t)$ to estimate the survival
functions.
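The time-free nature of the relation (4.4.3) can be checked numerically. The following is a minimal sketch only: the exponential baseline, its rate and the value of beta are illustrative assumptions and are not taken from the thesis. It shows that log(-log S_1(t)) - log(-log S_0(t)) equals beta whatever the value of t.

```python
import math

def baseline_survival(t, rate=0.1):
    """Hypothetical exponential baseline S_0(t) = exp(-rate * t)."""
    return math.exp(-rate * t)

def lehmann_survival(t, beta, z=1, rate=0.1):
    """S_Z(t) = S_0(t) ** exp(beta * z), as in (4.4.2)."""
    return baseline_survival(t, rate) ** math.exp(beta * z)

def log_cumulative_hazard_gap(t, beta, rate=0.1):
    """log(-log S_1(t)) - log(-log S_0(t)); free of t under (4.4.3)."""
    s0 = baseline_survival(t, rate)
    s1 = lehmann_survival(t, beta, 1, rate)
    return math.log(-math.log(s1)) - math.log(-math.log(s0))

# The gap is evaluated at three arbitrary times with an assumed beta of 0.7.
gaps = [log_cumulative_hazard_gap(t, beta=0.7) for t in (1.0, 5.0, 20.0)]
```

Any other baseline survival function could be substituted for the exponential one; the gap depends only on beta.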



In the original approach Cox adopts a conditional argument to
construct the likelihood. At any moment in time there exists a
particular risk set $R(t_{(i)})$. Any failure at the unique ith time,
namely $t_{(i)}$, must have arisen from this set. Therefore the
probability of failure at $t_{(i)}$, given the risk set $R(t_{(i)})$
(or given survival up until $t_{(i)}$), is

$$L_i = \frac{\exp(\beta Z_i)}{\sum_{j \in R(t_{(i)})} \exp(\beta Z_j)} \tag{4.4.4}$$

For the population of size n the likelihood function is composed of
the following factors, with censorings allowed to occur without
contributing to the likelihood:

$$L = \prod_{i=1}^{n} [L_i]^{d_i} = \prod_{i=1}^{n} \left[ \frac{\exp(\beta Z_i)}{\sum_{j \in R_i} \exp(\beta Z_j)} \right]^{d_i} \tag{4.4.5}$$

for $d_i = 0$ censored,
    $d_i = 1$ death.

(4.4.5) refers to a situation where no ties are present in the data.
Later we will consider a more general risk set for dealing with more
than one event at a given time.



Now for the general likelihood (4.4.4) & (4.4.5) with the
proportional hazard assumption we generate a population of size 5
and allocate the likelihood factor contributions as in Table (4.4.1).

Rank   Survival time   Censoring = 0   Z_i   Contribution to
                       Death = 1             likelihood [L_i]^{d_i}

 1          1               1           0    1/(3 + 2exp(β))
 2          5               1           1    exp(β)/(2 + 2exp(β))
 3         10               0           1    [exp(β)/(2 + exp(β))]^0
 4         20               1           0    1/2
 5         30               1           0    1/1

Table (4.4.1)
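The contributions in the last column of Table (4.4.1) can be reproduced mechanically from (4.4.4) and (4.4.5). The sketch below is illustrative only; the function name and the tuple layout are assumptions made for this example, and it handles data without ties.

```python
import math

# Data of Table (4.4.1): (survival time, death indicator d_i, covariate Z_i)
cases = [(1, 1, 0), (5, 1, 1), (10, 0, 1), (20, 1, 0), (30, 1, 0)]

def likelihood_contributions(cases, beta):
    """Return [L_i]^{d_i} for each case in order of survival time (no ties)."""
    ordered = sorted(cases)
    out = []
    for i, (_, died, z) in enumerate(ordered):
        risk_set = ordered[i:]  # cases still under observation at this time
        denom = sum(math.exp(beta * zj) for _, _, zj in risk_set)
        factor = math.exp(beta * z) / denom
        out.append(factor ** died)  # a censored case contributes 1
    return out

contributions = likelihood_contributions(cases, beta=0.0)
```

At beta = 0 every member of a risk set is equally likely to fail, so the factors reduce to one over the size of the risk set for deaths and to 1 for the censored case.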

On taking logarithms of Cox's conditional likelihood function
of survival time we have

$$\ln L = \sum_{\text{deaths } i} \beta Z_i - \sum_{\text{deaths } i} \log\left\{ \sum_{j \in \text{risk set at } t_{(i)}} \exp(\beta Z_j) \right\} \tag{4.4.6}$$

The argument is straightforward, with conditionality and a single
factor as in equation (4.4.4) for case i.

However Peto (1972b) raised some related points regarding the treatment
of ties and censorings. Later Kalbfleisch and Prentice (1973) challenged
this terminology, since the conditionality for a single case does not
carry over to the full population. As is clear from Table (4.4.1), the
3rd survival time is equal to 10 and is a censored time. This implies
that there is no contribution from this case to the likelihood function.
Cox's conditionality argument makes an extrapolation on the state of
failure of the remaining set. This extrapolation is related to the
pattern of ties in the presence of possible censoring and assumes
that the distance between the events does not contain relevant
information. The extent of the assumptions can be judged by
considering the types of events that can occur.

In reducing the times of events into ranks we produce 3 types
of observable events. Within any ranking interval, say from the ith to
the (i+1)th event, for the risk set at $t_{(i)}$ the information on time
may contain any of the following 3 groups.

The first class are those present at the beginning and the end of the
time period. The consequence to the likelihood is that full information
is contained within the risk set.

The next class are those that die within the period. The consequence
to the likelihood is that maximum information is contained if deaths
occur just prior to i+1. Clearly a death can occur anywhere within
the minimum observable time. Here we have considered death to be the
event of interest.

Finally there is the group that are not present during all or part of
the period, that is, cases with death at $t_{(i)}$ and cases with
censoring between $t_{(i)}$ and $t_{(i+1)}$. In this situation the
consequence to the likelihood is made most realistic by ranking
censorings after deaths.

Next we will show how the general likelihood is a deviation from
the full likelihood. Later we will mention situations where the two
likelihoods give close estimators. We begin with an explanatory
definition for (4.4.4):

$$L_i = \frac{\exp(\beta Z_{(i)})}{\sum_{l \in R(t_{(i)})} \exp(\beta Z_l)} = \Pr(\text{individual } (i) \text{ fails} \mid \text{risk set at } t_{(i)} \text{ and } t_{(i)} \text{ has at least one death})$$

$$= \Pr(\text{death at } t_{(i)} \mid \text{all previous censoring and present failure information at } t_{(i)})$$

The last expression is in fact the set of sufficient information necessary
to obtain column 5 of Table (4.4.1).



Any $L_i$ for a failure at $t_{(i)}$ is more generally conditional on
"past history", which was expressed as the risk set at $t_{(i)}$ and the
fact that the event is a failure. Thus sufficient information for "past
history" is the censoring and failure information for cases 1 to (i-1),
plus the ith censoring information. In other words the probability
regarding death at $t_{(i)}$ is made conditional upon the information
regarding the occurrence of all previous deaths and censorings, and also
including the information regarding censoring at the current time. The
probability regarding a censoring at $t_{(i)}$ is conditional upon the
information regarding the occurrence of the previous deaths and
censorings. This is expressed as

$L_i = \Pr(\text{death at } t_i \mid 1 \text{ to } (i-1) \text{ censorings and deaths, and the ith censoring})$
and $L_i = \Pr(\text{censored at } t_i \mid 1 \text{ to } (i-1) \text{ censorings and deaths})$



The full likelihood is a combination of the above. Using the
previous definitions and the general arguments of survival
analysis, we take the full likelihood to be

$$L \propto \prod_{\text{Deaths}} (\text{Hazard}) \cdot \prod_{\text{All}} (\text{Survivals}) \tag{4.4.7}$$

Let D be the number of cases prior to the last death. Then

$$L = \prod_{i=1}^{D} \lambda_0(t_i)\, e^{\beta Z_i(t_i)} \prod_{i=1}^{n} \exp\left[ -\int_0^{t_i} \lambda_0(u)\, e^{\beta Z_i(u)}\,du \right] \tag{4.4.8}$$

The above follows from (4.4.7) and (4.4.1).



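In a population of size 3, with the first two cases failing at $t_1$ & $t_2$
and the third case censored at $c_3$, we obtain on expansion of (4.4.8),
for i = 1,

$$L = \lambda_0(t_1)\, e^{\beta Z_1(t_1)} \exp\left[ -\int_0^{t_1} \lambda_0(u)\, e^{\beta Z_1(u)}\,du \right]$$

for i = 1, 2 we have

$$L = \lambda_0(t_1)\, e^{\beta Z_1(t_1)}\, \lambda_0(t_2)\, e^{\beta Z_2(t_2)} \exp\left[ -\int_0^{t_1} \lambda_0(u)\, e^{\beta Z_1(u)}\,du \right] \exp\left[ -\int_0^{t_2} \lambda_0(u)\, e^{\beta Z_2(u)}\,du \right]$$

and for all three cases, splitting each integral into the periods between
successive events, the exponential terms are

$$\exp\left[ -\int_0^{t_1} \lambda_0(u) e^{\beta Z_1(u)} du \right]
\exp\left[ -\int_0^{t_1} \lambda_0(u) e^{\beta Z_2(u)} du \right]
\exp\left[ -\int_{t_1}^{t_2} \lambda_0(u) e^{\beta Z_2(u)} du \right]$$

$$\cdot \exp\left[ -\int_0^{t_1} \lambda_0(u) e^{\beta Z_3(u)} du \right]
\exp\left[ -\int_{t_1}^{t_2} \lambda_0(u) e^{\beta Z_3(u)} du \right]
\exp\left[ -\int_{t_2}^{c_3} \lambda_0(u) e^{\beta Z_3(u)} du \right]$$

By rearranging the exponential terms for each integral period
$t_{i-1}$ to $t_i$ we get

$$\prod_{i=1}^{D} \exp\left[ -\int_{t_{i-1}}^{t_i} \lambda_0(u) \sum_{j \in R_i} e^{\beta Z_j(u)}\,du \right] \tag{4.4.9}$$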

The rest of (4.4.8) is the contribution of the deaths, the
first part of the equation, given by

$$\prod_{i=1}^{D} \lambda_0(t_i) \exp[\beta Z_i(t_i)] \tag{4.4.10}$$

Note that (4.4.8) is the product of (4.4.9) & (4.4.10).
By combining (4.4.9) & (4.4.10) we obtain

$$L = \prod_{i=1}^{D} \underbrace{\left\{ \exp\left[ -\int_{t_{i-1}}^{t_i} \lambda_0(u) \sum_{j \in R_i} e^{\beta Z_j(u)}\,du \right] \lambda_0(t_i) \sum_{j \in R_i} e^{\beta Z_j(t_i)} \right\}}_{(1)} \; \underbrace{\left\{ \frac{\exp(\beta Z_i(t_i))}{\sum_{j \in R_i} e^{\beta Z_j(t_i)}} \right\}}_{(2)} \tag{4.4.11}$$

Note that the above is equivalent to (4.4.8) with the term
$\sum_{j \in R_i} e^{\beta Z_j(t_i)}$ introduced into the equation. Part (2)
of the equation is clearly the usual partial likelihood and part (1)
gives the extra contributions for the full likelihood. If $\lambda_0(t)$
is unknown then part (1) provides little information about the $\beta$'s.
Thus (4.4.11) reduces to the likelihood (4.4.5).

Now we will return to the generalisation of the closing parts
of section 4.3, using the above mathematical notation. Parts
(1) and (2) of the full likelihood (4.4.11) involve two time dependent
quantities, $\lambda_0(u)$ and $Z(u)$. We can often assume that part (1) of
the likelihood does not contribute to the information on covariates.
This assumption holds when $Z(u)$ is independent of $\lambda_0(u)$ in
the integral from $t_{i-1}$ to $t_i$.



The problem of tied observations was treated by Cox (1972) in
the following manner.

Say two observations, a & b, are tied. Because we do not know the order
of these events, the actual probability contributed to the likelihood is

$$\frac{\exp(\beta Z_a)}{\sum_{j \in R} \exp(\beta Z_j)} \cdot \frac{\exp(\beta Z_b)}{\sum_{j \in R-a} \exp(\beta Z_j)} + \frac{\exp(\beta Z_b)}{\sum_{j \in R} \exp(\beta Z_j)} \cdot \frac{\exp(\beta Z_a)}{\sum_{j \in R-b} \exp(\beta Z_j)}$$


Cox's approximation is 



n 2exp( B Z.) / 
i = a , b 



. ( 2exp(8 Z.))- exp ( BZ.) 
i=a , b 3 1 



Peto (1972b) suggested an approximation to the tied ranks distribution
and later Kalbfleisch (1972) referred to this likelihood as the marginal
likelihood. He pointed out that this assumption prohibits the use of
time dependent covariates. The expression for the above is



n exp ( 8 Z ) / r i 
i=a,b / rj< I exp ( 6 Z.) / r ) m 

L J j£R ] 



where r is the number at risk at the time the ties have occurred and m
is the number of events tied.

The approach proposed by Cox for dealing with ties makes the
calculations exceedingly cumbersome. The amount of calculation in fact
multiplies as the number of ties in the sample increases. However in
our study this method is used, mainly because the alternative approach
prohibits the use of time dependent covariates. The partial likelihood
of (4.4.11) can be expressed as a function of the log likelihood of k
distinct deaths as

$$\ln L = L(\beta) = \sum_{i=1}^{k} \left[ \beta' Z_i - \ln\left( \sum_{j \in R_i} \exp(\beta' Z_j) \right) \right] \tag{4.4.12}$$

To use the Newton-Raphson method of maximisation we first require the
first two derivatives of the log likelihood with respect to $\beta$.
Different derivations of the likelihood may introduce different
restrictions on the form of $Z_i(t)$. However the following can be
obtained without loss of generality.

$$\frac{\partial L(\beta)}{\partial \beta_p} = \sum_{i=1}^{k} \left[ Z_{ip} - \frac{\sum_{j \in R_i} Z_{jp} \exp(\beta' Z_j)}{\sum_{j \in R_i} \exp(\beta' Z_j)} \right] \tag{4.4.13}$$

Thus we want solutions to

$$\sum_{i=1}^{k} \left( Z_{ip} - \frac{\sum_{j \in R_i} Z_{jp} \exp(\beta' Z_j)}{\sum_{j \in R_i} \exp(\beta' Z_j)} \right) = 0$$



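The equation can be solved by the Newton-Raphson procedure and use of
the following 2nd partial derivatives.

$$\frac{\partial^2 L(\beta)}{\partial \beta_p\, \partial \beta_q} = -\sum_{i=1}^{k} \left( \frac{\sum_{j \in R_i} Z_{jp} Z_{jq} \exp(\beta' Z_j)}{\sum_{j \in R_i} \exp(\beta' Z_j)} - \frac{\left( \sum_{j \in R_i} Z_{jp} \exp(\beta' Z_j) \right) \left( \sum_{j \in R_i} Z_{jq} \exp(\beta' Z_j) \right)}{\left( \sum_{j \in R_i} \exp(\beta' Z_j) \right)^2} \right) \tag{4.4.14}$$

We can thus estimate the $\beta$ values in (4.4.12) and thereby assess the
effects of the concomitant variables Z. Using the above derivations
we will now proceed with a few commonly used testing procedures.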
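The Newton-Raphson iteration based on the first and second derivatives of the log partial likelihood can be sketched for a single covariate as follows. This is an illustrative sketch and not the program used in the study: the function names are hypothetical, the data are those of Table (4.4.1), and no ties or time dependent covariates are handled.

```python
import math

# Data of Table (4.4.1): (survival time, death indicator, covariate Z)
cases = [(1, 1, 0), (5, 1, 1), (10, 0, 1), (20, 1, 0), (30, 1, 0)]

def score_and_information(cases, beta):
    """First derivative and negative second derivative of the log partial likelihood."""
    ordered = sorted(cases)
    u, info = 0.0, 0.0
    for i, (_, died, z) in enumerate(ordered):
        if not died:
            continue  # censored cases add no factor of their own
        risk = ordered[i:]
        w = [math.exp(beta * zj) for _, _, zj in risk]
        s0 = sum(w)
        s1 = sum(wj * zj for wj, (_, _, zj) in zip(w, risk))
        s2 = sum(wj * zj * zj for wj, (_, _, zj) in zip(w, risk))
        u += z - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return u, info

def newton_raphson(cases, beta=0.0, tol=1e-10, max_iter=50):
    """Iterate beta <- beta + U(beta) / I(beta) until the step is negligible."""
    for _ in range(max_iter):
        u, info = score_and_information(cases, beta)
        step = u / info
        beta += step
        if abs(step) < tol:
            break
    return beta

beta_hat = newton_raphson(cases)
```

For this small data set the score equation can also be solved by hand, giving exp(2 beta) = 3/2, which provides a check on the iteration.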



For testing the global null hypothesis that all coefficients
are identically zero, Cox gives the efficient score statistic of Rao,
based on

$$Q = U'(0)\, I^{-1}(0)\, U(0)$$

where U is the vector of all first derivatives given by (4.4.13) and I is
the information matrix, composed of elements given by the negatives of
the second derivatives in (4.4.14). Under the null hypothesis Q has
asymptotically a chi-squared distribution whose degrees of freedom
$\nu$ equal the number r of concomitant variables. In studies where
r is large, as in most clinical trials, Cox suggests
the use of significance tests for subsets of the parameter estimates.
The two commonly used tests are the asymptotic likelihood and the
asymptotic normality tests.

For the likelihood ratio test with one degree of freedom
we have

$$-2\,( L(\beta_a) - L(\beta_b) )$$

where $\beta_a$ and $\beta_b$ are vectors of parameter estimates that are
included in the likelihood model. $\beta_a$ in fact spans a space which
is a subset of that of $\beta_b$, and the two often have a dimensional
difference of one. The test then has a chi-squared distribution with one
degree of freedom under the null hypothesis that the concomitant
information missing in the likelihood of $\beta_a$ has an estimator of
zero in $\beta_b$.

With the assumption of asymptotic normality, for a one-sided
$\alpha$% significance level we have

$$\Pr\left( \beta_p \,/\, [I^{-1}(\beta)]_{pp}^{1/2} > t_\alpha \right) = \alpha$$

where $t_\alpha$ refers to the percentage point of the t distribution at
the $\alpha$ significance level. The above tests are used extensively in
the simulation studies of Chapter 5.
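For a single covariate the three statistics described above can be computed side by side. The sketch below is only an illustration under stated assumptions: it reuses the five cases of Table (4.4.1), the helper names are hypothetical, and the fit is a plain Newton-Raphson loop with a fixed number of iterations.

```python
import math

# Data of Table (4.4.1): (survival time, death indicator, covariate Z)
cases = sorted([(1, 1, 0), (5, 1, 1), (10, 0, 1), (20, 1, 0), (30, 1, 0)])

def risk_sums(beta):
    """For each death return (Z_i, S0, S1, S2) summed over its risk set."""
    rows = []
    for i, (_, died, z) in enumerate(cases):
        if died:
            zs = [zj for _, _, zj in cases[i:]]
            w = [math.exp(beta * zj) for zj in zs]
            rows.append((z,
                         sum(w),
                         sum(a * b for a, b in zip(w, zs)),
                         sum(a * b * b for a, b in zip(w, zs))))
    return rows

def log_lik(beta):
    return sum(beta * z - math.log(s0) for z, s0, _, _ in risk_sums(beta))

def score_info(beta):
    rows = risk_sums(beta)
    u = sum(z - s1 / s0 for z, s0, s1, _ in rows)
    info = sum(s2 / s0 - (s1 / s0) ** 2 for _, s0, s1, s2 in rows)
    return u, info

# Maximise the log partial likelihood by Newton-Raphson.
beta_hat = 0.0
for _ in range(25):
    u, info = score_info(beta_hat)
    beta_hat += u / info

u0, i0 = score_info(0.0)
q_score = u0 * u0 / i0                                # Rao efficient score statistic
lr_stat = -2.0 * (log_lik(0.0) - log_lik(beta_hat))   # likelihood ratio statistic
z_wald = beta_hat / math.sqrt(1.0 / score_info(beta_hat)[1])  # asymptotic normality
```

With only five cases the asymptotic distributions are of course not reliable; the example serves only to show how the quantities are assembled.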

In the final part of this section we will show that the first and
second derivatives, namely (4.4.13) and (4.4.14), are in fact
translation invariant. This result is of interest when we consider
transformation of the covariates for the fast convergence of the
iterative methods.

The values of $\beta$ are invariant under a translation of Z to (Z + a),
where a is any vector of constants. At this stage we substitute
$(Z_j + a)$ for $Z_j$ in equations (4.4.13) & (4.4.14) and show that the
ratios remain the same. We have

$$\sum_{i=1}^{k} \left[ Z_{ip} + a_p - \frac{\sum_{j \in R_i} (Z_{jp} + a_p) \exp(\beta'[Z_j + a])}{\sum_{j \in R_i} \exp(\beta'[Z_j + a])} \right]$$

$$= k a_p + \sum_{i=1}^{k} \left[ Z_{ip} - \frac{\exp(\beta' a) \left( \sum_{j \in R_i} Z_{jp} \exp(\beta' Z_j) + a_p \sum_{j \in R_i} \exp(\beta' Z_j) \right)}{\exp(\beta' a) \sum_{j \in R_i} \exp(\beta' Z_j)} \right]$$

$$= k a_p + \sum_{i=1}^{k} \left[ Z_{ip} - \frac{\sum_{j \in R_i} Z_{jp} \exp(\beta' Z_j)}{\sum_{j \in R_i} \exp(\beta' Z_j)} \right] - k a_p$$

$$= \sum_{i=1}^{k} \left( Z_{ip} - \frac{\sum_{j \in R_i} Z_{jp} \exp(\beta' Z_j)}{\sum_{j \in R_i} \exp(\beta' Z_j)} \right)$$

and hence, on setting the equations once again to zero,
we will have the same values for the $\beta$ estimators.
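The invariance can also be confirmed numerically. The sketch below is a check under illustrative assumptions (single covariate, the data of Table (4.4.1), an arbitrary shift of 3.5, hypothetical function names); it evaluates the first derivative and the negative second derivative before and after the translation.

```python
import math

def score_and_information(cases, beta):
    """First derivative and negative second derivative of the log partial likelihood."""
    ordered = sorted(cases)
    u, info = 0.0, 0.0
    for i, (_, died, z) in enumerate(ordered):
        if not died:
            continue
        zs = [zj for _, _, zj in ordered[i:]]
        w = [math.exp(beta * zj) for zj in zs]
        s0 = sum(w)
        s1 = sum(a * b for a, b in zip(w, zs))
        s2 = sum(a * b * b for a, b in zip(w, zs))
        u += z - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return u, info

cases = [(1, 1, 0), (5, 1, 1), (10, 0, 1), (20, 1, 0), (30, 1, 0)]
shift = 3.5  # arbitrary constant a added to every covariate value
shifted = [(t, d, z + shift) for t, d, z in cases]

# Absolute differences in (score, information) at a few trial values of beta.
differences = []
for beta in (-1.0, 0.0, 0.4):
    u, info = score_and_information(cases, beta)
    u_s, info_s = score_and_information(shifted, beta)
    differences.append((abs(u - u_s), abs(info - info_s)))
```

The differences are zero apart from rounding error, in agreement with the algebra above.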

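For the 2nd derivatives we have, writing $w_j = \exp(\beta' Z_j)$ for
brevity and taking all inner sums over $j \in R_i$,

$$\frac{\partial^2 L(\beta)}{\partial \beta_p\, \partial \beta_q} = -\sum_{i=1}^{k} \frac{\left( \sum w_j \right) \left( \sum Z_{jp} Z_{jq} w_j \right) - \left( \sum Z_{jp} w_j \right) \left( \sum Z_{jq} w_j \right)}{\left( \sum w_j \right)^2} = (*)$$

Again using the same substitution we get

$$-\sum_{i=1}^{k} \frac{e^{2\beta' a} \left[ \left( \sum w_j \right) \left( \sum (Z_{jp} + a_p)(Z_{jq} + a_q) w_j \right) - \left( \sum (Z_{jp} + a_p) w_j \right) \left( \sum (Z_{jq} + a_q) w_j \right) \right]}{e^{2\beta' a} \left( \sum w_j \right)^2}$$

On expanding the products in the numerator, the terms involving the
constants are

$$a_p \left( \sum w_j \right) \left( \sum Z_{jq} w_j \right) + a_q \left( \sum w_j \right) \left( \sum Z_{jp} w_j \right) + a_p a_q \left( \sum w_j \right)^2 - a_p \left( \sum w_j \right) \left( \sum Z_{jq} w_j \right) - a_q \left( \sum w_j \right) \left( \sum Z_{jp} w_j \right) - a_p a_q \left( \sum w_j \right)^2 = 0$$

so that the expression reduces to (*).

The above results indicate that a translation of $Z_j$ to $Z_j + a$
leaves the second derivatives of the log likelihood
with respect to the $\beta$ values the same.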



As was mentioned, the second derivatives are used
in the estimation of the variance of the $\beta$ estimators. It is defined
to be

$$\mathrm{Var}(\beta) = \left( -\frac{\partial^2 L(\beta)}{\partial \beta_p\, \partial \beta_q} \right)^{-1}$$

The values of the second partial derivatives are also used in the
Newton-Raphson estimation procedure, where a function is formed to
obtain convergence to the maximum of equation (4.4.12).

The method is iterative and it spans the likelihood surface until
it finds the required maximum. The rate of convergence to the maximum
depends on the slope and shape of the likelihood surface.
Primarily the rate of convergence is slow if a number of covariates
have a large scale range and these show a degree of correlation.
The consequence is that the variance covariance matrix at inversion
will have a determinant which is almost zero. The problem of scale
range can be remedied by subtracting the mean value of the covariate
from the covariate if it has a large scale range.

In Chapter 6 we will use the above and alternative methods
based on the categorisation of the continuous variables.



4.5 Covariate interaction and time dependency. 




In this section we will consider the proportional hazards
model in terms of possible functional forms of the relative risk part
of the model. We will describe the possible functions that may be
useful and efficient in an analysis of clinical trial data. We will
relate these functions to appropriate hazard rate patterns, and in
future chapters some of the topics and models of this section
will be used in the analysis and interpretation of the results.
Here we keep the definition of $\lambda_0(t)$ as in the previous
sections of this chapter. The $r(Z, \beta, t)$ function measures the
relative risk differences between the base line hazard and
the projected subgroup hazard rates. We reiterate the point that
this difference may be due to various forms of time-dependency or
purely due to fixed covariate effects. In most circumstances this
distinction is not very clear, especially in an exploratory analysis
or a situation of measurement over time. This effect may be referred
to as time confounding and is related to the influence of the various
covariates on each other within the time scale. An example is a
situation where treatment effect comparisons may show a different
relative risk pattern for younger patients and for older patients.
Such an effect is testable by a complete model of age and treatment.
A different approach may attempt to test the adequacy of the
functional form of $r(Z, \beta)$ by inclusion of a treatment and time
dependent covariate based on the time scale itself. Much of this
section is related to various developments of $r(Z, \beta, t)$ and the
way it can influence time dependency, interaction and confounding. In a
situation of stratified analysis of the data with, say, an exponential
decomposition of the relative risks we have

$$\lambda_k(t, Z) = \lambda_{0k}(t)\, Z_{0k} \exp(Z_1 \beta_1 + Z_2 \beta_2 + \ldots)$$



where $Z_{0k}$ is set to be a dummy variable and conditions the analysis
on the stratum of interest k. That is

$$Z_{0k} = \begin{cases} 1 & \text{case belongs to the stratum of interest, } k \\ 0 & \text{case does not belong to the relevant stratum } k \end{cases}$$



In effect, repeating the analysis for the various strata results
in a different base line hazard being produced for each stratum.
The significance of this point lies merely in the
method by which stratified analysis may be incorporated into the
general procedures. The resultant effect on the partial likelihood
argument is the introduction of a conditionality parameter such that

$$\Pr(\text{subject } i \text{ failing at } t_i \mid \text{presence until } c_i \text{ or } t_i \text{ and also membership of strata}) = \frac{Z_{0k}\, r(Z_i, \beta)}{\sum_{l \in R} Z_{0k}\, r(Z_l, \beta)} = \frac{r(Z_i, \beta)}{\sum_{l \in k^*} r(Z_l, \beta)}$$

where k* is the new risk set and excludes all cases not belonging
to the particular stratum. A point that must be noted is that the
$Z_{0k}$ function in the above may be interpreted as a function adjusting
$\lambda_{0k}(t)$ rather than one acting on $r(Z, \beta)$.



Then

m , . , 

x , (t) = [1 — Ml ] Z , 
ok r... ok 

(i) 

An example is a situation of separate analysis for the older patients,
such that

if age > d then $Z_{0k} = 0$

and if age ≤ d then $Z_{0k} = 1$, where d is a constant on the age
scale.



Now we consider a situation where $Z_1$ is related to a categorised
separation of a continuous variable, say time or size. Suppose
we set $Z_1 = (-1, 0, 1)$. We then test the effect of $Z_1$ under the
assumption of the above categorisation, that the relative risk at the
lower level $Z_1 = -1$ is related to the middle level $Z_1 = 0$ by the
same scale which relates the middle level to the higher level $Z_1 = 1$.
A more elaborate analysis will allow the 3 levels of $Z_1$ to act
independently by introducing $Z_1^2$ as a further covariate, with its
own coefficient, alongside $Z_1 \beta_1$ in the exponent.

In the case of interaction effects being present in the data between
two covariates we may have expressions of the form

$$\exp(Z_1 \beta_1 + Z_2 \beta_2 + Z_1 Z_2 \beta_{12})$$

Under this assumption we are testing the multiplicative effect of
$Z_1$ & $Z_2$ on each other and on $\lambda_0(t)$, i.e. the relative risks

$$\exp(Z_1 \beta_1) \cdot \exp(Z_2 \beta_2) \cdot \exp(Z_1 Z_2 \beta_{12})$$
For an actual trial we can represent the various subgroups for,
say, treatment A and treatment B and the node positive and negative
groups, such that the hazard rates for the two treatment groups are

$$\lambda(t, Z_t) = \begin{cases} \lambda_0(t) & \text{for patients of group A treatment} \\ \lambda_0(t)\, e^{\beta_t} & \text{for patients of group B treatment} \end{cases}$$

The common base line hazard $\lambda_0(t)$ is clearly a nuisance parameter.
The relative risk $e^{\beta_t}$ represents the effect of treatment. The
greater its deviation from 1, the greater is the importance of the new
treatment.



For the two prognostic subgroups a similar pattern may be represented:

$$\lambda(t, Z_t, Z_n) = \begin{cases} \lambda_0(t) & \text{group with treatment A, node negative} \\ \lambda_0(t)\, e^{\beta_t} & \text{group with treatment B, node negative} \\ \lambda_0(t)\, e^{\beta_n} & \text{group with treatment A, node positive} \\ \lambda_0(t)\, e^{\beta_t} e^{\beta_n} & \text{group with treatment B, node positive} \end{cases}$$

The above structure however considers $\beta_t$ and $\beta_n$ to have the
same effect whether they are present singly or both simultaneously.
There is an extension to the model by which one can test the
effects of treatment and node while also testing the effect of their
simultaneous presence:

$$\lambda(t, Z_t, Z_n) = \begin{cases} \lambda_0(t) & \text{group with treatment A, node negative} \\ \lambda_0(t)\, e^{\beta_t} & \text{group with treatment B, node negative} \\ \lambda_0(t)\, e^{\beta_n} & \text{group with treatment A, node positive} \\ \lambda_0(t)\, e^{\beta_t} e^{\beta_n} e^{\beta_I} & \text{group with treatment B, node positive} \end{cases}$$

Then if $\beta_I$ is significantly different from 0 there is a
suggestion that the new treatment may be more effective for one
prognostic group than the other. In the above we have dealt with
binary treatment and prognostic categories. In a case of 3 categories
of a prognostic indicator, say size divided into 3 separate classes
of small, medium and large tumours, an expansion of the concept of
interaction is possible. As in the example of the node we may have
a linear interaction of size with treatment. However, because
there are 3 levels of size present, we may have various
quadratic effects acting. That is, the larger tumours may be
behaving in a way completely different to those of small and medium size.
We may then introduce two different sets of covariates, $Z_s = (-1, 0, 1)$
and $Z_s^2 = (-1, 0, 1)^2$, so that a test of size effect may be done in
such a way that the various main effects and interaction effects of size
are independent.
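The treatment by node structure with an interaction term can be written as a small function of the two binary indicators. This is a sketch for illustration only: the coefficient values used below are arbitrary and are not estimates from any of the trials discussed, and the function name is hypothetical.

```python
import math

def relative_risk(treatment_b, node_positive, beta_t, beta_n, beta_i=0.0):
    """Relative risk exp(Z_t*beta_t + Z_n*beta_n + Z_t*Z_n*beta_i) against
    the base line group (treatment A, node negative)."""
    z_t = 1 if treatment_b else 0
    z_n = 1 if node_positive else 0
    return math.exp(beta_t * z_t + beta_n * z_n + beta_i * z_t * z_n)

# Arbitrary illustrative coefficients (not taken from the thesis).
beta_t, beta_n, beta_i = -0.4, 0.9, 0.3

# Treatment effect (B against A) within each nodal group.
effect_node_negative = (relative_risk(True, False, beta_t, beta_n, beta_i)
                        / relative_risk(False, False, beta_t, beta_n, beta_i))
effect_node_positive = (relative_risk(True, True, beta_t, beta_n, beta_i)
                        / relative_risk(False, True, beta_t, beta_n, beta_i))
```

The two treatment effects differ by the factor exp(beta_i), so a test of beta_i = 0 is a test of whether the treatment effect is the same in both nodal groups.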

In a situation where time may affect the influence of certain
covariates, we may represent the time interaction by

$$\exp(\beta_1 Z_1 + \beta_t Z_1 t^*), \quad \text{for } t^* \text{ a function of time.}$$

A Taylor series expansion of the time dependent effect gives

$$(1 + \beta_t Z_1 t^*) \exp(\beta_1 Z_1), \quad \text{for } (\beta_t Z_1 t^*)^j \to 0 \text{ as } \beta_t \to 0 \text{ for } j \ge 2$$

In the previous model the terms of order j ≥ 2 have been
considered insignificant in the Taylor series expansion. Clearly
other possible situations for detecting departures of a specific type
from the proportional hazard assumption can require a model of the form

$$\lambda(t, Z) = \lambda_0(t)\, [1 + (Z_1 \beta_t t^*)] \exp(\beta Z)$$



When considering time dependencies the functional form of t* is 
also of importance for an efficient analysis. It may be necessary 
to transofrm t to ~ - . Alternatively if the effect on covariate 
is influenced exponentially with time we use (ln(t) - ln(T)). 
An analogous approach may use a transformation of the time scale
t* to a 0 or 1 scale, so that the effects of intervening events such as
metastatic recurrence may be studied.

Here we must make an important distinction between the
various forms of time dependency which have been considered.
It can be that a measurement over time like age is considered an
independent value which affects the survival time. It may be that
age is considered to have a time scale which is inappropriate under
the proportional hazard assumption, and therefore the study of
departures of particular types based on the functional form of t* may
be of interest. Finally we may be interested in the study of intervening
events like metastatic recurrences.



In the analysis of the data presented in the next chapter
we will use a functional form of $r(Z, \beta)$ referred to in
Cox's paper, the exponential decomposition of the relative risk
$\exp(\beta Z)$. We will return to the topic of time dependency in
Chapters 7 and 8, where a more detailed study and analysis of
trial data will be performed. As we mentioned in section (4.3),
ln(t) is a natural transformation for testing time dependency
of Weibull form in a proportional hazards setting. In Chapter 8
we will relate these topics to concepts of change and random
covariate effects. In Chapter 7 we will consider various
forms of time dependency, such as logarithmic or linear time
dependency. Further we will study the effects of intervening events
by transformations of the time scale into a binary 0 or 1 scale.

CHAPTER 5

SIMULATION OF PATIENT ACCRUAL TIME TO RESPONSE in a clinical
trial with proportional and non-proportional hazard rates.

In this chapter we will describe a method for generating
random samples of survival times with a given distributional
assumption. The distributional form of the generated sample will clearly
play a major role in the value of an analysis method. Further we will
develop a method of producing different levels of censoring times,
as an analogue of the random arrival of patients into
the trial and of early analysis when some patients are still alive.

It is intended that by such an approach a comparative
study of the generated small samples of survival data may be made
with varying values of covariates, censoring percentages, sample
sizes and hazard rates.

For reasons of comparison we explain type I and type II
errors in the context of the present study, and finally the results
are presented and discussed.

5.1 Generation of survival times.

In Chapter 3 we described some of the possible
distributional forms of the survival times. We also presented some
of the empirical results to show that different patterns of failure
rates do occur in practice. One major aim for any system of generating
random samples is that the method should be flexible, so that
we may produce a range of survival times with a good level of control
over the many factors under study.

We will present a manageable method of simulating random
survival distributions with proportional and non-proportional hazards,
relevant to failure time analysis. Further, for a realistic simulation
of trials we will develop an approach for accrual and censoring
times.

We confine the study to the most commonly used distributions, the
exponential and the Weibull, under covariate constraints. The method
of generation provides a good methodology for producing distributions
both in the framework of covariates and also in terms of
time-dependency. However it does not extend to censoring. Later on in
this section we will describe an algorithm for censorings.

The conditional survival distribution function of Weibull
survival in the presence of covariates may be represented by

$$S(t, Z) = \exp[ -(\mu t)^{\nu} e^{Z\beta} ] \tag{5.1.1}$$

where $\nu$ is the shape parameter of the Weibull distribution. Clearly
at $\nu = 1$ we have the special case of the exponential distribution.
The survival time T is always greater than or equal to
zero, Z is a vector of explanatory indicators, $\beta$ is the vector of
parameters that eventually has to be estimated and $\mu$ is a parameter
for "adjusting" the rate of the hazard functions. The conditional
probability density function and the conditional hazard function for
T then follow from (5.1.1):

$$-\frac{\partial S(t, Z)}{\partial t} = f(t, Z) = \mu\nu (\mu t)^{\nu - 1} e^{Z\beta} \exp[ -(\mu t)^{\nu} e^{Z\beta} ] \tag{5.1.2}$$

and

$$\lambda(t, Z) = \frac{f(t, Z)}{S(t, Z)} = \mu\nu (\mu t)^{\nu - 1} e^{Z\beta} \tag{5.1.3}$$



To make the functions more manageable we use a two stage
transformation. In its present form it is not easy to recognise the
probability distribution of the above. However after the
transformations we will relate the distribution to the extreme value
distribution. We let Y have a probability density function $f_Y(y)$. If
h(y) is either increasing or decreasing in y, then U = h(Y) has the
density function given by

$$f_U(u) = f_Y[h^{-1}(u)] \left| \frac{\partial h^{-1}(u)}{\partial u} \right| \tag{5.1.4}$$

A useful step is finding the density function of Y = log T.
Therefore we use the function h(t) = log t, giving $h^{-1}(y) = \exp(y)$ and

$$\frac{\partial h^{-1}(y)}{\partial y} = \exp(y), \quad -\infty < y < \infty$$

Now substituting t = exp(y) in f(t, Z) and multiplying by
|exp(y)| we get

$$f(y, Z) = \mu\nu (\mu e^{y})^{\nu - 1} e^{Z\beta} \exp[ -(\mu e^{y})^{\nu} e^{Z\beta} ]\, e^{y}$$

This gives

$$S(y, Z) = \exp[ -(\mu e^{y})^{\nu} e^{Z\beta} ]$$

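which can be derived using $f(y, Z) = -\partial S(y, Z) / \partial y$.
For manageability we use a further transformation of Y,

$$W = Y\nu - \alpha\nu + Z\beta, \quad \text{where } \alpha = -\log(\mu)$$

The inverse function of interest is

$$h^{-1}(w) = \frac{w}{\nu} + \alpha - \frac{Z\beta}{\nu}, \qquad \frac{\partial h^{-1}(w)}{\partial w} = \frac{1}{\nu}$$

Then we obtain f(w, Z) by substituting $y = \frac{w}{\nu} + \alpha - \frac{Z\beta}{\nu}$
and multiplying by $|1/\nu|$. We have the probability density function of
w and Z as

$$f(w, Z) = e^{-\alpha} \nu \left( e^{-\alpha} e^{\frac{w}{\nu} + \alpha - \frac{Z\beta}{\nu}} \right)^{\nu - 1} e^{Z\beta} \exp\left[ -\left( e^{-\alpha} e^{\frac{w}{\nu} + \alpha - \frac{Z\beta}{\nu}} \right)^{\nu} e^{Z\beta} \right] e^{\frac{w}{\nu} + \alpha - \frac{Z\beta}{\nu}}\, \frac{1}{\nu}$$

$$= \nu \left( e^{\frac{w}{\nu} - \frac{Z\beta}{\nu}} \right)^{\nu} e^{Z\beta} \exp\left[ -\left( e^{\frac{w}{\nu} - \frac{Z\beta}{\nu}} \right)^{\nu} e^{Z\beta} \right] \frac{1}{\nu}$$

$$= (e^{w - Z\beta})\, e^{Z\beta} \exp[ -(e^{w - Z\beta})\, e^{Z\beta} ]$$

$$= (e^{w}) \exp[ -(e^{w}) ] = \exp(w - e^{w}) \tag{5.1.5}$$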



The above is an example of an extreme value distributed random
variable w, with the survivor function

$$1 - F(w) = \exp(-e^{w}), \quad -\infty < w < \infty \tag{5.1.6}$$

Now (5.1.6) is in a convenient form so that the required distributions
may be generated.

In cases where there are no covariates present the survival
distribution reduces to

$$S(t) = \exp[ -(\mu t)^{\nu} ]$$

and it follows that the generator is

$$Y = \frac{W}{\nu} + \alpha, \quad \text{where } \mu = \exp(-\alpha)$$

When $\nu$ is equal to 1, the Weibull distribution reduces to an
exponential distribution with the survival distribution given by

$$S(t) = \exp(-\mu t)$$

and the generating function is y = w + $\alpha$.

Here we will not discuss the actual values for $\nu$, $\alpha$, $\beta$ and $\mu$.
However later we will mention the actual values that are used in the
study.



The extreme value distributed random variables can be
easily generated by applying two logarithms to a set of
uniformly distributed random variables between 0 and 1, so that

W = log(-log U), for U uniformly distributed between 0 and 1.
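The generation steps of this section, W = log(-log U) followed by the inverse of the transformations leading to (5.1.5), can be sketched as below. This is a minimal illustration under stated assumptions: the function names and parameter values are hypothetical, a single scalar covariate is used, Python's standard generator stands in for the generator actually used in the study, and no censoring is applied.

```python
import math
import random

def weibull_time_from_uniform(u, z, beta, mu, nu):
    """Map a uniform variate u in (0, 1) to a survival time T with
    S(t, Z) = exp[-(mu*t)**nu * exp(Z*beta)], as in (5.1.1)."""
    w = math.log(-math.log(u))                  # extreme value variate
    y = w / nu - math.log(mu) - z * beta / nu   # y = w/nu + alpha - Z*beta/nu, alpha = -log(mu)
    return math.exp(y)                          # T = exp(Y)

def survival(t, z, beta, mu, nu):
    """Conditional Weibull survival function (5.1.1)."""
    return math.exp(-((mu * t) ** nu) * math.exp(z * beta))

def simulate_times(n, z, beta, mu, nu, seed=1):
    """Draw n uncensored survival times for covariate value z."""
    rng = random.Random(seed)
    times = []
    while len(times) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # guard against log(0)
            times.append(weibull_time_from_uniform(u, z, beta, mu, nu))
    return times

# Illustrative parameter values only.
sample = simulate_times(5, z=1, beta=0.5, mu=0.1, nu=1.5)
```

A useful property for checking an implementation is that the survival function evaluated at the generated time returns the uniform variate that produced it.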

One result that may be of practical importance, although we do not
use it further in this thesis, is the pattern by which the lognormal
distribution can mimic the standard extreme value distribution. This
will allow a similar expansion of the methods, so that other
distributions may be approximated to produce other shapes of the
hazard functions, using the same procedures.

Standardised Cumulative Distribution Functions

   X       Extreme Value    Lognormal

  -2.0        .00063          .0002
  -1.5        .02140          .0196
  -1.0        .1321           .1324
  -0.5        .3443           .3471
   0.0        .5704           .5700
   0.5        .7440           .7423
   1.0        .8558           .8546
   1.5        .9224           .9207
   2.0        .9577           .9579
   2.5        .9775           .9773
   3.0        .9881           .9884
   3.5        .9937           .9939
   4.0        .9967           .9968

The above extreme value distribution is the standardised extreme
value distribution with

\[ S(X) = \exp[-\exp(-1.28254\,x - 0.57722)]. \]

The Lognormal then has the distribution

\[ S(X) = (\sqrt{2\pi})^{-1} \int_{-\infty}^{U(X)} \exp(-\tfrac{1}{2}u^{2})\,du \]

with

\[ U(x) = 6.52771 \log_{10}(x + 2.74721) - 2.68853. \]
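The two standardised distribution functions quoted above can be evaluated directly, which gives a quick check on the tabulated values. The sketch below is illustrative only; the function names are hypothetical and the normal integral is computed with `math.erf`.

```python
import math


def extreme_value_cdf(x):
    """Standardised extreme value form: exp[-exp(-1.28254 x - 0.57722)]."""
    return math.exp(-math.exp(-1.28254 * x - 0.57722))


def lognormal_cdf(x):
    """Standard normal integral up to U(x) = 6.52771 log10(x + 2.74721) - 2.68853."""
    u = 6.52771 * math.log10(x + 2.74721) - 2.68853
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


for x in (-2.0, -1.0, 0.0, 1.0, 2.0, 4.0):
    print(f"{x:5.1f}  {extreme_value_cdf(x):.4f}  {lognormal_cdf(x):.4f}")
```

At x = 0 the two functions give approximately .5704 and .5700, in agreement with the table.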

We will now describe the present extreme value distribution
more specifically. We will use the standard probability density
function of X as a unimodal $\exp(w - e^{w})$, with skewness -1.14,
kurtosis 2.4, variance $\pi^{2}/6 = (1.282)^{2}$ and the mean -0.5772,
which is the negative of Euler's constant.

Before we proceed with a discussion of the distributional
properties of the extreme value distribution generations for various
sample sizes, we mention a few words on the uniform random number
generator.



The uniform random number generator used in the computer
procedures is based on the standard random number generator of the
Unix operating system version V Library of Programs. The system is
available on the PDP 11 computer at the Medical Computing and
Statistics Unit of the Edinburgh University. For all of the
simulations we have used the seventh edition of the Unix operating
system and its various software on the PDP 11 computer. The
uniform random generator asks for 2 initial seeds to begin the
simulations. We have used values 1 and 2 as the initiators of our
generations. In order to allow the random numbers to reach stability
we proceed with the generation of 500 random numbers and then use
the 501st generated value as the first effective random number for the
survival samples. The returning value of the generator is within
the range of values 0 and 1. We repeat the generations for the
various values of sample size and note that the generations conform
to a mean value of 1/2 and the variance of 1/12
for various sample sizes that are greater than 20 over the range
of values that we examined.
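The warm-up and the check on the uniform moments can be imitated as follows. This is a sketch under assumptions: Python's `random.Random` stands in for the Unix generator, it takes a single seed rather than two, and the function name `uniform_summary` is hypothetical.

```python
import random
import statistics

rng = random.Random(1)        # hypothetical seed; the original generator took two seeds
for _ in range(500):          # discard the first 500 values as a warm-up
    rng.random()


def uniform_summary(n):
    """Mean and variance of the next n uniform values from the warmed-up generator."""
    sample = [rng.random() for _ in range(n)]
    return statistics.fmean(sample), statistics.pvariance(sample)


for n in (20, 50, 100, 1000):
    mean, var = uniform_summary(n)
    print(n, round(mean, 4), round(var, 4))   # compare with 1/2 and 1/12
```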



Prior to proceeding with the generation of random
survival samples, the extreme value distributed random variables were
generated for different sample sizes. The purpose is to assess
the capability of the generator in conforming to the above specifications.
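A check of this kind can be sketched by generating samples of increasing size and computing their first four moments. The code below is an illustration, not the thesis procedure: the seed and function names are hypothetical, and the kurtosis is reported as excess kurtosis (theoretical value 2.4). The standard deviation is printed next to the variance because $\pi/\sqrt{6} \approx 1.2825$ while $\pi^{2}/6 \approx 1.645$.

```python
import math
import random


def extreme_value_sample(n, rng):
    """Draw n values of W = log(-log U), skipping the measure-zero case U = 0."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:
            out.append(math.log(-math.log(u)))
    return out


def moments(xs):
    """Return mean, variance, skewness and excess kurtosis of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


rng = random.Random(1)                      # hypothetical seed
for n in (50, 100, 200, 1000, 30000):
    mean, var, skew, kurt = moments(extreme_value_sample(n, rng))
    print(n, round(mean, 3), round(var, 3), round(math.sqrt(var), 3),
          round(skew, 3), round(kurt, 3))
```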



10 Standard extreme value distributed random variables
with the sample size of 50

Sample No.    Mean     Variance   Skewness   Kurtosis

    1        -.732      1.504      -0.271      9.581
    2        -.716      1.249      -0.543     -0.135
    3        -.397      1.166      -0.133     -0.146
    4        -.727      1.241      -0.151     -0.68[?]
    5        -.428      1.262      -2.351     -0.683
    6        -.495      1.058       0.223     -1.711
    7        -.658      1.417      -0.446      5.597
    8        -.280      1.005      -3.107     -2.084
    9        -.391      1.150      -0.730     -0.331
   10        -.456      1.361      -0.606      0.984



5 Standard extreme value distributed random variables
with the sample size of 100.

Sample No.    Mean     Variance   Skewness   Kurtosis

    1        -0.581     1.517      -3.745      9.480
    2        -0.496     1.075      -0.177     -1.505
    3        -0.454     1.412      -2.540      9.669
    4        -0.637     1.295      -0.775      0.545
    5        -0.568     1.354      -0.965      1.821



5 Standard extreme value distributed random variables
with the sample size of 200.

Sample No.    Mean     Variance   Skewness   Kurtosis

    1        -0.597     1.362      -2.187      9.462
    2        -0.494     1.364      -1.476      3.721
    3        -0.458     1.209      -0.362     -0.724
    4        -0.462     1.315      -1.005      2.408
    5        -0.663     1.270      -1.164      1.421




5 Standard extreme value distributed random variables
with the sample size of 1000

Sample No.    Mean     Variance   Skewness   Kurtosis

    1        -0.592     1.260      -1.151     [illegible]
    2        -0.584     1.288      -1.207      2.384
    3        -0.575     1.272      -1.118      2.343
    4        -0.557     1.271      -1.118      2.670
    5        -0.459     1.188      -0.678      2.521




3 Standard extreme value distributed random variables
with the sample size of 30,000

Sample No.    Mean     Variance   Skewness   Kurtosis

    1        -0.591     1.293      -1.184      2.712
    2        -0.588     1.287      -1.125      2.354
    3        -0.569     1.285      -1.159      2.674

One important point to note is that the above are random samples
and that there has been no selection.

The sample with n = 30,000 clearly shows that we are generating
the appropriate distribution and therefore asymptotically all
moments are stabilised and conform to the theoretical values. In
so far as the small sample properties are concerned the first
moment (mean) is stabilised at n = 100, variance at n = 200, kurtosis
at n = 1000, and skewness is the last to stabilise, before n = 30,000,
over the range of values examined.

For practical purposes we only study small sample
properties of the statistical methods, therefore it is of interest
to know how well mean and variance conform to the theoretical values.
However for the sake of consistency it is important to know that
given a large enough sample size the distribution conforms fully,
as far as skewness and kurtosis are concerned, with the generated
population.

5.2 Illustration of the method of generation.

Now for the purpose of illustrating the generation of
survival times in the simulation procedure we define a population with
the following characteristics.

Let the hazard be constant and fixed at exp(-5.2983), giving
$\lambda = 0.005$. Allow two parameters $\beta_1$ and $\beta_2$ to have
values 0.99 and 0.49 respectively. Let the covariate indicator vectors
$Z_1$ and $Z_2$ be equally and uniformly allocated to values of 0 or 1.
Finally it is assumed that none of the survival times have been censored.

Since we are assuming a constant hazard we use the
generating expression

\[ Y = W + a - Z_1\beta_1 - Z_2\beta_2 . \]

Since there are 2 covariates, each at the 0 or 1 level, 4 distributions
are generated by the subroutine, namely S(t,0,0), S(t,1,0), S(t,0,1)
and S(t,1,1). These survival distributions may be obtained by
substituting in the general distribution

\[ S(t, Z_1, Z_2) = \exp\bigl[-e^{-a}\, t\, e^{Z_1\beta_1 + Z_2\beta_2}\bigr] \]

where $\beta_1 = 0.99$ and $\beta_2 = 0.49$, giving

S(t, 0, 0) = Exp [-0.005 t]

S(t, 0, 1) = Exp [-0.005 t (1.6323)]

S(t, 1, 0) = Exp [-0.005 t (2.6912)]

S(t, 1, 1) = Exp [-0.005 t (4.3928)]

With a perfect approximation to the above theoretical distributions
the estimated values of $\beta_1$ and $\beta_2$ must correspond to values
0.99 and 0.49 respectively. The 4 distributions are illustrated in
figure (5.2.1), and table (5.2.1) gives the details of the sample of
size 50 which was generated.
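The four theoretical survival functions follow from the baseline hazard 0.005 and the coefficients 0.99 and 0.49. A small sketch (the names `LAMBDA`, `BETA1`, `BETA2` and `survival` are chosen here for illustration only) reproduces the multipliers and can be used to evaluate S(t, Z1, Z2) at any time:

```python
import math

LAMBDA = 0.005                 # baseline hazard, exp(-5.2983)
BETA1, BETA2 = 0.99, 0.49      # covariate coefficients from the text


def survival(t, z1, z2):
    """S(t, Z1, Z2) = exp[-lambda * t * exp(Z1*beta1 + Z2*beta2)]."""
    return math.exp(-LAMBDA * t * math.exp(z1 * BETA1 + z2 * BETA2))


for z1, z2 in ((0, 0), (0, 1), (1, 0), (1, 1)):
    multiplier = math.exp(z1 * BETA1 + z2 * BETA2)
    print((z1, z2), round(multiplier, 4), round(survival(100.0, z1, z2), 4))
```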
Survival   Hazard     Sum of
  Time     Function   Z_i b_i       W          Y       Z_1  Z_2
                      (i = 1, 2)

   44.0    0.00500    0.49000   -1.00926    3.79906     0    1
   30.0    0.00500    1.48000   -0.39407    3.42425     1    1
   64.0    0.00500    0.49000   -0.63759    4.17073     0    1
   62.0    0.00500    0.49000   -0.67770    4.13062     0    1
  270.0    0.00500    0.49000    0.79077    5.59908     0    1
  264.0    0.00500    0.00000    0.27990    5.57822     0    0
   19.0    0.00500    0.49000   -1.84074    2.96758     0    1
   16.0    0.00500    0.99000   -1.53203    2.77629     1    0
  220.0    0.00500    0.00000    0.09956    5.39788     0    0
   20.0    0.00500    0.99000   -1.28939    3.01893     1    0
   65.0    0.00500    1.48000    0.35743    4.17575     1    1
   64.0    0.00500    0.99000   -0.13930    4.16902     1    0
   68.0    0.00500    1.48000    0.41424    4.23256     1    1
    7.0    0.00500    1.48000   -1.77981    2.03851     1    1
  151.0    0.00500    0.99000    0.71370    5.02202     1    0
   71.0    0.00500    0.99000   -0.03391    4.27441     1    0
   16.0    0.00500    0.99000   -1.51903    2.78929     1    0
   80.0    0.00500    0.00000   -0.90993    4.38838     0    0
   84.0    0.00500    0.99000    0.12893    4.43724     1    0
  357.0    0.00500    0.00000    0.58049    5.87881     0    0
   11.0    0.00500    1.48000   -1.34890    2.46941     1    1
   81.0    0.00500    1.48000    0.58743    4.40575     1    1
   79.0    0.00500    0.49000   -0.42989    4.37843     0    1
  242.0    0.00500    0.00000    0.19377    5.49208     0    0
    4.0    0.00500    0.00000   -3.84875    1.44957     0    0
   71.0    0.00500    0.49000   -0.54172    4.26660     0    1
   30.0    0.00500    0.00000   -1.87210    3.42621     0    0
   23.0    0.00500    0.49000   -1.64768    3.1606?     0    1
    2.0    0.00500    1.48000   -2.86412    0.95420     1    1
   18.0    0.00500    0.49000   -1.90699    2.90133     0    1
   74.0    0.00500    0.99000    0.00538    4.31370     1    0
   12.0    0.00500    0.99000   -1.78716    2.52116     1    0
   77.0    0.00500    0.99000    0.04566    4.35398     1    0
  119.0    0.00500    0.49000   -0.02331    4.78501     0    1
   85.0    0.00500    0.49000   -0.36473    4.44359     0    1
    0.0    0.00500    0.99000   -5.07574   -0.76742     1    0
  121.0    0.00500    0.00000   -0.50017    4.79814     0    0
  201.0    0.00500    0.49000    0.49686    5.30518     0    1
   35.0    0.00500    0.99000   -0.75237    3.55595     1    0
    2.0    0.00500    1.48000   -2.81221    1.00611     1    1
  242.0    0.00500    0.99000    1.18442    5.49274     1    0
   16.0    0.00500    1.48000   -1.00116    2.81715     1    1
   48.0    0.00500    0.99000   -0.42716    3.88116     1    0
   12.0    0.00500    0.49000   -2.26778    2.54053     0    1
  405.0    0.00500    0.99000    1.69616    6.00448     1    0
  385.0    0.00500    0.49000    1.14530    5.95361     0    1
   15.0    0.00500    0.49000   -2.08965    2.71867     0    1
  154.0    0.00500    0.99000    0.73326    5.04158     1    0
   11.0    0.00500    1.48000   -1.40793    2.41039     1    1
   43.0    0.00500    0.49000   -1.04437    3.76394     0    1

Table (5.2.1) Actual survival times generated from distributions of
figure (5.2.1)
The values for the columns of the table (5.2.1) correspond to

\[ Y = W + a - Z_1\beta_1 - Z_2\beta_2 . \]

As an example for the first row we have

a = - ln (0.005) = 5.2983

$Z_1\beta_1 + Z_2\beta_2 = 0.49$

W = -1.00926

Y = -1.00926 + 5.2983 - 0.49 = 3.79906

Time = Int [Exp (3.79906)] = 44

We can plot the above data using life table analysis methods to
derive cumulative survivals (as mentioned before). The purpose at
this stage is not to do a detailed comparison of the estimation methods
but rather a general overview of the survival generator.
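The arithmetic of the first row can be repeated for any row of table (5.2.1). The helper below is a sketch with a hypothetical name, `generated_time`; it takes W and the two covariate indicators and returns Y together with the truncated survival time.

```python
import math

A = -math.log(0.005)           # a = -ln(0.005), about 5.2983


def generated_time(w, z1, z2, beta1=0.99, beta2=0.49):
    """Return (Y, Time) with Y = W + a - Z1*beta1 - Z2*beta2 and Time = int(exp(Y))."""
    y = w + A - z1 * beta1 - z2 * beta2
    return y, int(math.exp(y))


y, t = generated_time(-1.00926, 0, 1)      # first row of the table
print(round(y, 5), t)                      # 3.79906 44
```

The second row, with W = -0.39407 and both indicators equal to 1, gives Y close to 3.42425 and a time of 30 in the same way.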



The following 4 tables give comparisons of the cumulative 
survival estimates using the product limit estimation, as in Figure 
(5.2.2) and the theoretical value of the exponential distributions 
in Figure (5.2.1) 
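For reference, the product limit estimate and its standard error can be computed as in the following sketch. It is an assumption that the standard errors were obtained from Greenwood's formula; for uncensored data this reduces to the binomial form sqrt(S(1 - S)/n), which agrees with the tabulated values. The function name is hypothetical and tied times are combined into a single step.

```python
import math


def product_limit(times, events):
    """Return (time, S, se) at each distinct failure time; events are 1 (failure) or 0 (censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    s, greenwood = 1.0, 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j, deaths = i, 0
        while j < len(data) and data[j][0] == t:   # collect ties at time t
            deaths += data[j][1]
            j += 1
        if deaths:
            s *= 1.0 - deaths / at_risk
            if at_risk > deaths:
                greenwood += deaths / (at_risk * (at_risk - deaths))
            out.append((t, s, s * math.sqrt(greenwood)))  # se is 0 once S reaches 0
        at_risk -= j - i
        i = j
    return out


# Times of the group with both covariates at 0, as listed in the first comparison table.
group_00 = [4, 30, 80, 121, 220, 242, 264, 375]
for t, s, se in product_limit(group_00, [1] * len(group_00)):
    print(t, round(s, 4), round(se, 4))
```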
Time    Cumulative Survival    S.E.     Theoretical S(t,0,0)

   4         .8750            .1169          .9801
  30         .7500            .1531          .8601
  80         .6250            .1712          .6703
 121         .5000            .1768          .5460
 220         .3750            .1712          .3328
 242         .2500            .1531          .2981
 264         .1250            .1169          .2671
 375         .0000            .0000          .1533

Table (5.2.2)

Time    Cumulative Survival    S.E.     Theoretical S(t,0,1)

  12         .9375            .0605          .9067
  15         .8750            .0827          .8847
  18         .8125            .0976          .8633
  19         .7500            .1083          .8563
  23         .6875            .1159          .8288
  43         .6250            .1210          .7040
  44         .5625            .1240          .6983
  62         .5000            .1250          .6028
  64         .4375            .1240          .5431
  71         .3750            .1210          .5601
  79         .3125            .1159          .5247
  85         .2500            .1083          .4997
 119         .1875            .0976          .3780
 201         .1250            .0827          .1938
 270         .0625            .0605          .1104
 385         .0000            .0000          .0431

Table (5.2.3)

Time    Cumulative Survival    S.E.     Theoretical S(t,1,0)

   0         .9333            .0644         1
  12         .8667            .0878          .8508
  16           -                -
  16         .7333            .1142          .8063
  20         .6667            .1217          .7640
  35         .6000            .1265          .6244
  48         .5333            .1288          .5241
  64         .4667            .1288          .4226
  71         .4000            .1265          .3846
  74         .3333            .1217          .3694
  77         .2667            .1142          .3548
  84         .2000            .1033          .3229
 151         .1333            .0878          .1310
 154         .0667            .0644          .1259
 405         .0000            .0000          .0043

Table (5.2.4)

Time    Cumulative Survival    S.E.     Theoretical S(t,1,1)

   2           -                -
   2         .8182            .1163          .9570
   7         .7273            .1343          .8574
  11           -                -
  11         .5454            .1501          .7853
  16         .4545            .1501          .7036
  30         .3636            .1450          .5174
  65         .2727            .1343          .2398
  68         .1818            .1163          .2245
  81         .0909            .0867          .1687
 242         .0000            .0000          .0049

Table (5.2.5)
5.3 Generation of Censored Survival Times.

As we described earlier, in the first step we generate
a survival distribution S(t) according to a set of covariates and
the shape of the hazard function. A sorted plot of this sample
will show a survival pattern as in figure (5.3.1). It is assumed
that all patients enter the study at time zero. In this sample
there are no censored cases. In terms of clinical trials it is
assumed that sufficient time has passed since the start of the trial
to allow an observation of full length of survival.

In real data from a clinical trial the situation is slightly
different. Patients do not arrive simultaneously into the study.
Patients are not observed for a full length of survival, either because
they drop out of the study, or because analysis is performed at a time
when not all patients have had the chance of producing a complete
survival time.

First we consider the problem of arrival or accrual period.
This kind of follow-up study is composed of two periods, the accrual and
the follow-up period. The accrual is a period to allow a sufficient
number of patients to enter the study so that a reasonable statistical
comparison may be made of the patients. Thus the accrual period
in such a study becomes a function of the required sample size and
the rate of arrival of patients. In effect the accrual period
is prejudged by the value of treatment difference which the study
is designed to detect. In general it may be assumed that all
patients enter the study uniformly, and that there are no trends or
seasonal patterns present in the covariates according to the accrual
period. In practice randomisation of the patients provides this
condition for the main treatments. However such conditions may not
hold for the prognostic factors. Take the case of a situation
where by chance younger patients are entered in the first year of
a study and older patients in the second year.

Another period we consider is the follow-up time. In
so far as the clinical trial procedure is concerned a good clinical
trial provides conditions and procedures so that all patients may
be followed up and that at the end of study or time of the interim
analysis, survival status is recorded for all of the patients. Such
a condition guarantees that censoring, if it occurs, is only due to
reasons of treatment, disease and patients and not due to the
procedures of follow-up and withdrawals. In the simulations of this
Chapter we assume that the above condition holds. The follow-up period
is in practice often dependent on many external factors, due to
constraints from management of patient care. Further it is customary
to do a number of interim analyses of the data. For an
efficient unambiguous analysis there should be no crossing patterns
present in the survival rates. [...] often the case [...]
and any variability in the [...] of survivors [...]
the survival rates of the subgroups of patients.
There is one complicating factor in practice which
arises in multiple failure studies. If there is more than one
form of failure responsible for the reduction of the cases, the
usual approach of analysis is then by classifying one set of end
points as failures and another set as censored. Such a study
requires a different method of generation of [...].
This method can also be used in a way to allow a different
rate to be present for each covariate set, so that the problem of
lost to follow-up for different groups may be assessed. Within
the present simulations we assume that there is only one single cause
of death [...] due to the fact that [...]
take place.

In the introduction we mentioned the various forms of
censoring and stated that in trials we are often interested in random
censoring, by which arrival of [...]
an accrual period and thus any censoring at time of the analysis is
random censoring. This latter [...]
the generation of the random samples. [...]
later, rather than fixing the time of [...]
the analysis to vary slightly so that we can have fixed censoring
percentages at time of each analysis. Thus we summarise:

Total length of study = Accrual period + Follow-up period.

For each individual patient we have a time $t_i$ which is generated
from S(t), and a time from start of the trial to the entry of
patient i. We will let $a_i$ denote this entry time, a uniformly
distributed random variable between 0 and A.

Thus the total length of study for the patient i is $a_i + t_i$.

We can now transform the figure (5.3.1) into the figure (5.3.2)
in which the accrual period is also represented. Now we can produce
the figure (5.3.3) in which the data is sorted according to the values
of $a_i + t_i$.



As we mentioned before most trials are analysed in one
of the following two situations. Either a final analysis is
performed prior to the minimum sufficient time for producing a
complete survival time, or the clinical trial results have
been formed and discussed at an interim stage. In both cases we
can generalise to the following: every trial analysis has a fixed
value I which is a point in the time of the study when survival
information prior to it is complete and all possible events after
this time are taken to be censored.

A crucial factor before the start of a trial is a decision
on the likely number of events. Two factors that are in practice
of considerable interest in design of a trial are the accrual
period and the number of patients in the study. Based on these
assessments a decision is finally made on the appropriate times at
which interim analysis and the main analysis may be performed.
Now it seems proper to summarise some of the generalisations
that have been made in the course of our discussions.

(1) It has been assumed that censoring is synonymous with
non-informative censoring. Therefore there are no situations in which
patients leave the study due to side effects or other forms of
failure. Thus patient censoring times are distributed in the same
manner between subgroups. An example of the violation of the above
assumption would be a higher dropout rate from an arm of a trial due
to a particular prognostic indicator or treatment.

(2) Further, if the relative rate of failure is not constant between
groups, that is, there exists a time dependency [...]
their corresponding failure rates, then the level of censoring must
follow a trend in time. That [...]
censored at any given time. As an example, if one subgroup has a
higher failure rate at the end of a study, a measure of censoring
percentage will give a different relative difference at early and
late stages of the time scale. Alternatively no time [...]
imply a constant relative rate of censoring.

The relation between (1) and (2) is an interesting [...]
represents the relationship of problems of competing risks and the
time dependency of the hazard [...]
generation of censoring [...] we do [...]
(1). However by use of differing failure rates we will study
the effects of time dependency. In a descriptive manner we consider
(1) to be a causal situation within which we have [...]
cause of death. Using the example given in (1) a good analysis
may indicate a link, if it exists, [...]
and a particular cause of death. In (2) [...] we
are describing the failure rate as a form of a function of
time. This function of time however need not be of a [...]
but be of a continuous form as
described in the example given in (2). [...]
of (1) and (2) their examples can be exchanged at times in the
language of the other, with (2) being slightly more flexible. We
can describe cause of death in (1) in terms of time dependency of
(2) by letting time dependency be a parametric function of the type
of failure. An example is the effect of old age on survival
distributions. For reasons of consistency of the conclusions one may use
an approach by which deaths suspected of old age are dealt with as
censored, or alternatively define a functional form of the old
age and incorporate this function into the relative risk as a
diagnostic check of the relative risk. As we pointed out, in this
section we will concentrate on the simulation of time dependent type,
and later in Chapters 7 and 8 we will consider introduction of
types (1) and (2) in appropriate application with various functional
forms of time dependency.

Finally in this section we discuss some of the points
regarding censoring times. A purpose of this study is to evaluate
the power of different tests according to their level of censoring.
An analytical assessment is impossible, thus we require some criterion.
Such a criterion must be general enough to be relevant to real life
practice and thus easy to draw relevant conclusions from. In the
next section we will discuss such a criterion. However on the point
of censoring, the accrual period and follow-up period can both be
thought of as some form of fixed variables and thus we can generate
different levels of censoring according to the relationship between
them.

As an alternative to the above we can fix the level of
censoring. Thus a 10% censoring in a sample of 50 implies that
the interim time is somewhere between $(a_{45} + t_{45})$ and
$(a_{46} + t_{46})$, see figure (5.3.3). Again since we are only
considering a fixed percentage value for the level of censoring [...];
in case of ties being present, the exact value of I [...]
a uniform distribution between the 45th and 46th [...]
accrual time plus survival time. After this [...]
censored.
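The fixed-percentage censoring scheme can be sketched as follows. This is an illustration under assumptions, not the thesis subroutine: entry times are drawn uniformly over the accrual period, the analysis time I is drawn uniformly between the two ordered values of accrual plus survival time that bracket the required censoring level, and a patient censored at I is given the observed time I minus the entry time. The function name, the seed and the guard against negative observed times are hypothetical.

```python
import math
import random


def censor_sample(times, accrual_period, censor_fraction, rng):
    """Return (observed times, event indicators, analysis time I) for a fixed censoring level."""
    n = len(times)
    entry = [rng.uniform(0.0, accrual_period) for _ in times]
    total = [a + t for a, t in zip(entry, times)]
    ordered = sorted(total)
    k = n - round(censor_fraction * n)           # number of complete observations
    analysis_time = rng.uniform(ordered[k - 1], ordered[k])
    observed, event = [], []
    for a, t, tot in zip(entry, times, total):
        if tot > analysis_time:                  # failure would occur after I: censored
            observed.append(max(analysis_time - a, 0.0))
            event.append(0)
        else:
            observed.append(t)
            event.append(1)
    return observed, event, analysis_time


rng = random.Random(1)                                   # hypothetical seed
times = [rng.expovariate(math.exp(-5.0)) for _ in range(50)]
observed, event, analysis_time = censor_sample(times, 50.0, 0.10, rng)
print(sum(event), "failures,", 50 - sum(event), "censored at I =", round(analysis_time, 1))
```

For a sample of 50 and 10% censoring, `k` is 45, so I falls between the 45th and 46th ordered values and five cases are censored.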



The procedure is thus fixed by a set of covariates, the
percentage of censoring, a fixed value of hazard rate and a fixed
value of accrual period for the sample. It is important to note
that the value of hazards must be fixed so that reasonable survival
[...]. The value of accrual period also
has to be adjusted so that a realistic sample is generated. For
the following simulations we [...] the base line hazard at a constant
value of Exp(-5) = 0.00674, with the [...] of 50 [...]
(which may be considered as months). For each single simulation
[...] of the above values we repeat the simulations 300 times;
this value of 300 is set to be fixed for all simulations presented
in this study.

A widely held view among statisticians involved with the
design of clinical trials is that the sample size and power assessment
are the most crucial factors in proceeding with a scientifically
sound trial. This scientific stand is in practice often confronted
with management constraints that eventually lead to a form of
compromise in the design of the trial.
In the introduction we mentioned some of the drawbacks and
difficulties in the analysis of data with small samples in the
presence of a number of covariate effects and a process of time
dependency. We termed such effects in general to be interrelations.
Later we will analyse real clinical trial data with an exploratory
emphasis and use of such interrelations. In the later chapters
we will discuss functional forms of time dependency. We will now
deal with small sample properties of Cox's proportional hazard
models. The main aim is to establish a criterion for a comparative
study of small sample sizes, under trial design constraints. The
factors that we have taken into account in the generation of the
survival samples have been chosen with a particular emphasis on
crucial design factors in a realistic clinical trial. These factors
are accrual period, censoring, sample size and interim analysis
time. In the process of the generations we have constantly adopted
a generating procedure that we have considered useful for a range of
applications in the later chapters. However a more detailed theoretical
study based on asymptotic properties can take a route different
from the one we have taken. Areas which may pose interesting
directions are situations of competing risk generations and stratified
analysis with varying accrual periods. It seems that study of small
sample properties of methods, as opposed to a more theoretical study
of the asymptotic properties of the method of analysis, is a useful
approach in a better understanding of the scientific design constraints.
S.D. Silvey (1975) discusses in detail small sample properties of
statistical tests with simple hypothesis (only one estimator) and
composite hypothesis (with more than one estimator or nuisance
parameters). Such properties are related to a function of
different errors that take part in any statistical hypothesis,
namely, Type I error and Type II error, say $E_\alpha$ (false
positive) and $E_\beta$ (false negative) (we denote these
as $E_\alpha$ and $E_\beta$ to avoid confusion with the $\beta$ covariates).

In this work we are dealing with survival [...]
definitions can be generalised to all statistical tests. In fact
in a clinical trial setting another branch of [...]
are based on a simple proportion and [...]
can provide all the necessary information regarding a new treatment.
[...] tests based on time to failure are playing an increasing
role. Such tests are based usually on producing some form of
approximation to the probability distribution of failure times or life
tables. In Chapters 2 and 3 we made the necessary distinction between
the distribution free tests and parametric approaches that are related
to life tables and discussed their properties [...].
In the evaluation of the sample size however at start of treatment,
some form of parametric assumption must be made. The most common is to
assume an exponentially distributed survival time, with the proportion
of survivors approximated by

\[ S(t) = e^{-\lambda t} \]

[...] patients with the mean survival time [...] and the hazard estimated
[...], which is asymptotically [...]
analogous to the results of the Chapter [...] on the discussion of the
exponential distribution. [...] of the [...]
studies were based on the study of the asymptotic properties of the
proportional hazards model and the exponential approximation. Such



comparisons are useful in practice due to conflicting benefits of the
methods. For example, although the exponential analysis is more
efficient given that the sample is generated from an exponential
distribution, the proportional hazard model has a far better robustness
property when the data is not exponentially distributed. Various
authors have discussed asymptotic properties of Cox's approach.
Kalbfleisch (1974) discusses the asymptotic efficiency for a single
covariate model. Efron (1977) discusses conditions for full asymptotic
efficiency, Kay (1979) provides a comparison of two covariate
models with the exponential, and Kalbfleisch and Macintosh (1977) expand
the results to the time dependent situation. The results indicated that
for covariates not dependent on time the estimation based on ranks
is fully efficient for $\beta = 0$ and has good properties for
$\beta \neq 0$.

For the case of time-dependent covariates the asymptotic properties
are related to the ratio of the hazard rates. From a different
position, properties of the proportional hazard models have been
studied in relation to the logrank test (Crowley (1974) and
Tarone and Ware (1979)). It is shown that asymptotically Cox's
method can lead to the logrank test. Lustbar (19*3) derives the
Wilcoxon test as a special case of Cox's model with a time dependent
covariate. It is shown that the two fully distribution-free tests
in fact differ only in their choice of weights as functions of the
number of cases at risk. A more detailed discussion of those tests
was given in Chapters 2 & 3.

Although the above studies are useful in allowing some form
of comparison between the methods, they do not allow comparisons for
differing sample sizes, censoring rates and censoring methods. The
point is made by Oaks (1981) that in small sample studies the
expectations may be different and peculiarities may be present.

In the study of the proportional hazard models and the
parametric models, two tests have been used in general. One is the
maximum likelihood estimation with asymptotic normality assumption and
the other is the likelihood ratio test. These results are given in
more detail for parametric methods in Chapter 3 and for proportional
hazards in Chapter 4.



It seems proper now to follow the study of small sample
properties in the following directions:

(1) Study more than 1 covariate with constant hazards.

(2) Study non-constant hazards.

(3) Study non-proportional hazards.

In the first instance we study the different methods on
exponentially generated samples with two covariates, firstly as a
matter of comparing relative power of the small samples and secondly
as an expansion of the 1 covariate study. As we mentioned earlier
two types of error are of interest. Now we develop these definitions
so that they may be used as a criterion for the comparisons. Type I
error, represented by $E_\alpha$, is under the control of the
statistician at the end of study, and type II error, $E_\beta$, is
dependent on sample size and the value of the covariate. The hypothesis
of special interest for all practical reasons is that of $\beta_1 = 0$.
The power of this test is then noted for the varying levels of
$\beta_1$ and $\beta_2$ over the different simulations. Another
hypothesis we consider is for a composite test of $(\beta_1, \beta_2)$.
The main purpose for this test is a theoretical one and
may be useful in exploratory stepwise regression techniques. The
theoretical basis of this point will be made clear later. The rate
of acceptance of the null hypothesis as the actual $(\beta_1, \beta_2)$
values differ from zero will give a measure of type II error. Namely it
is a function of the proportion of times that we may wrongly accept the
null hypothesis when it is false. At values of [...]
however, the same proportion is a function of the type I error, which
is the proportion of times we wrongly reject the null hypothesis when
it is true. The last statement in computational terms may be
represented by an equivalent rephrasing in which the value of $\beta_1$
is fixed at $\beta_{01}$ and the value of $\beta_2$ is set to
$\beta_{02}$ and we test the [...] as
$(\beta_1, \beta_2) = (\beta_{01}, \beta_{02})$. The power of a test is
a function of the alternative hypothesis and is related to $E_\beta$ in
the following way. The sample space of an observation in any test may
be divided into two regions. One region is called the acceptance region
and if the estimates fall into this space we accept the null
hypothesis. The rest of the space is called the critical region.



Thus

    Prob(estimator falls within the critical region | H₁)
        = 1 - E_β = Power,

and

    Prob(estimator falls within the acceptance region | H₁) = E_β.

Similarly, for the null hypothesis,

    Prob(estimator falls within the critical region | H₀) = E_α.



So far we have defined the power and acceptance region in
terms of the null hypothesis and the alternative hypothesis. Going
back to the opening section of this chapter, we rephrase by saying
that in general we seek a critical region such that the power is as
large as possible. Then, in addition to the control of the
probability of type I error at E_α, we shall have minimized the
probability of type II error at E_β.

These definitions in terms of survival analysis are always
further complicated by the fact that we are only interested in
testing a subset of the estimators that define our parameter space of
the critical region, and thus we always deal with a composite
hypothesis as opposed to a simple hypothesis in which the
distribution is fully defined. This point in general is related to
the effects of the hazard functions in the estimation of the relevant
covariate estimators.

The above point regarding the composite hypothesis for small
samples is here mainly a problem of illustration rather than a
theoretical one. Neyman and Pearson (1933) justify a method for
testing a simple hypothesis against a simple alternative. That is,
if we are choosing between two completely specified distributions
then the problem of finding a best critical region is simple and they
provide a solution. Further results of Lehmann and Scheffé (1950)
permit us to reduce the above problem of finding a most powerful
region for a composite hypothesis to the familiar problem of finding
a best critical region for a simple hypothesis.



Now we illustrate the problem for a simple case, as that of
obtaining an area of overlap of the error distribution of the
estimator and the sample parameter distribution. Figure (5.4.1)
represents the error regions for a one sided test. The null
hypothesis relates to the estimator having one normal distribution
and the alternative to its having another. If we adopt a significance
level of E_α for H₀ then the chance of a type I error is E_α, that
is, that we obtain the wrong conclusion when H₀ is true.

The possibility of obtaining a type II error is E_β, that is,
opting for the wrong conclusion when H₀ is false. The actual value of
the percentage of rejections is clearly dependent on the form of the
hypothesis that we adopt and the actual value of the coefficients
associated with the covariates. These will produce an indication of
type I and type II errors. For a single covariate situation we have:



H₀ : β = 0

                                     Conclusion
                              β = 0              β ≠ 0

Reality   β ≠ 0 => H₀ false   Type II = E_β      1 - E_β = Power
          β = 0 => H₀ true    1 - E_α            Type I = E_α



In the above, the percentage of rejections is represented by the
number of cases that fall into the second column of the table.
Given that the null hypothesis H₀ : β = 0 is true, we obtain a
measure of E_α, the observed significance level. Given that the null
hypothesis is false, we obtain a measure of the power of the test,
which is a function of sample size, censoring percentage and the
magnitude of the coefficients of the covariates.
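
As a minimal sketch of how the rejection proportions in the table
above can be estimated by simulation, the following Python code uses
a single binary covariate, exponential survival times without
censoring, and the log-rank statistic (the score test from Cox's
model for one binary covariate). The 0/1 coding of the covariate, the
function names, the number of replications and the seed are
assumptions made for illustration and are not taken from the
simulations reported here.

```python
import math
import random

CHI2_1DF_5PCT = 3.841  # upper 5% point of chi-squared with 1 d.f.


def logrank_statistic(times, events, groups):
    """Log-rank chi-squared statistic for a 0/1 group indicator."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n_at_risk = len(times)
    n1_at_risk = sum(groups)
    obs_minus_exp = 0.0
    variance = 0.0
    i = 0
    while i < len(order):
        t = times[order[i]]
        # Gather every subject leaving the risk set at time t.
        d = d1 = leave = leave1 = 0
        j = i
        while j < len(order) and times[order[j]] == t:
            k = order[j]
            d += events[k]
            d1 += events[k] * groups[k]
            leave += 1
            leave1 += groups[k]
            j += 1
        if d > 0 and n_at_risk > 1:
            frac = n1_at_risk / n_at_risk
            obs_minus_exp += d1 - d * frac
            variance += (d * frac * (1 - frac)
                         * (n_at_risk - d) / (n_at_risk - 1))
        n_at_risk -= leave
        n1_at_risk -= leave1
        i = j
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def rejection_rate(beta, n=50, reps=500, lam=math.exp(-5), seed=1):
    """Proportion of simulated trials in which H0: beta = 0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        groups = [i % 2 for i in range(n)]  # balanced allocation
        # Hazard lam * exp(beta * z): exponential proportional hazards.
        times = [rng.expovariate(lam * math.exp(beta * g)) for g in groups]
        events = [1] * n  # no censoring in this sketch
        if logrank_statistic(times, events, groups) > CHI2_1DF_5PCT:
            rejections += 1
    return rejections / reps
```

Calling `rejection_rate(0.0)` gives an estimate of the observed
significance level E_α, while a non-zero `beta` gives an estimate of
the power 1 - E_β for that alternative.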

The final remark on the method of simulations relates to the
allocation of the covariates in each sample generation. As was
remarked earlier, the variance covariance matrix of the covariates
plays an important role in the type II error. It is well known that
if different subgroups are numerically divided in an equal manner
then the efficiency is at its maximum. Further, the covariances
between these values play an important role, in that, depending on
the value of each covariate, a high correlation can reduce the
efficiency.

5.5 Description of parameter range for the trial simulations

In the generation of the (Z₁, Z₂) matrix we use the same
random number generator as before. We randomly allocate -1 and 1
values as a dichotomous function to Z₁ and Z₂ for each patient. We
obtain these values by dividing the 0 to 1 range of the uniform
random numbers into appropriate segments. Thus there are 4 sets of
patients in the data with the Z₁ and Z₂ values set to (-1, -1),
(-1, 1), (1, -1) and (1, 1).



A well designed trial would allocate equal numbers of patients
to each arm of the trial. Any other covariate set is not usually
controlled. With a large sample size all other uncontrolled effects
do at times level out in terms of treatment effects, and within each
type approximately equal numbers are usually allocated to each arm of
a trial. The importance of equal subgroups is mainly noticed in the
power of the tests. Tests are usually at their maximum efficiency if
they are composed of equal subgroups.

In the generation of the covariate sets we use a uniform random
number generator. The covariate sample generations however are fixed,
so that for a given sample size every simulation is composed of
exactly the same covariate sets. Within the sample set we intend to
balance the treatment effect so that there is a 50:50 likelihood of
allocation to a particular treatment. For the covariate set a
different ratio of the two covariate values is used, so that we do
not have a symmetrical relationship between β₁ and β₂. Thus a
particular set of values of (β₁, β₂) = (a, b) does not necessarily
correspond to a power efficiency value for (β₁, β₂) = (b, a). The
consequence of this effect is similar to the use of a lower sample
size for β₁ in comparison with β₂.



At this point a few remarks are needed regarding the different
possibilities for the generation of the covariate effects. One method
would be to allocate a different set of Z₁, Z₂ variables at each
simulation. Such a method implies that any power assessment is
complicated by the sample covariate variability. An alternative is to
fix the Z₁, Z₂ variable set for any required proportion within the
covariate categories and treatment groups. The consequence is that
the final results are conditional on the generated proportions within
the corresponding treatment and covariate groups. Bryson and Johnson
(1981) discuss a method for the generation of the former approach and
point out some of the theoretical problems with the generation of
monotone likelihoods in such generated samples. However, this problem
could be avoided, since it is realistic to add a restriction within
the generated simulations so that no subgroup is generated which
contains less than, say, 10% of the total sample size. The rest of
the procedure would then be confined to dividing the uniform random
distribution range from 0 to 1 into the relevant segments as
required. However, we are restricting ourselves to a fixed covariate
set for all samples and so the above problem does not arise. The
method for the generation of the 40:60 ratio prognostic covariate
set, Z₁, and the 50:50 ratio treatment set, Z₂, is to divide the
uniform random number scale into the following categories. We let
generations corresponding to uniform random numbers between 0 and 0.4
be the low level of Z₁, and values of random numbers greater than 0.4
and less than 1.0 correspond to the high level of Z₁. Further, we
subdivide each of the two parts of the range of the uniform random
number corresponding to high and low levels of Z₁ into two equal
parts, so that within the 0 to 0.4 range there is a 50:50 chance of
allocation to high and low levels of Z₂, and within 0.4 to 1.0 there
is a 50:50 chance of allocation to low and high levels of the
treatment effect. As an asymptotic property of the sample, the ratios
of the marginals of the treatment and covariate indicators will then
approach the required ratios. As we pointed out earlier, the major
emphasis is on small sample properties, and although the above
approach may be justified under certain theoretical conditions, in a
simulation of a clinical trial it is sufficient to condition our
results on a generated sample that conforms fully with the required
ratios.
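
The segment scheme described above can be sketched in Python as
follows. This is an illustration only: the function names and the
seed are hypothetical, and the choice of which half of each segment
maps to the low treatment level is an assumption, since the text does
not fix it. Note also that this sketch only approaches the 40:60 and
50:50 ratios as the sample grows; it does not enforce them exactly in
a small sample.

```python
import random


def covariates_from_uniform(u):
    """Map one uniform(0, 1) draw to (z1, z2), coded -1 (low) or +1 (high)."""
    if u < 0.4:                      # 40% of the range: low prognostic level
        z1 = -1
        z2 = -1 if u < 0.2 else 1    # two equal halves of [0, 0.4)
    else:                            # remaining 60%: high prognostic level
        z1 = 1
        z2 = -1 if u < 0.7 else 1    # two equal halves of [0.4, 1)
    return z1, z2


def fixed_covariate_set(n, seed=12345):
    """Generate one covariate set; reusing the seed reproduces the same set."""
    rng = random.Random(seed)
    return [covariates_from_uniform(rng.random()) for _ in range(n)]
```

Calling `fixed_covariate_set(25)` repeatedly with the same seed
returns the same list, which mirrors the decision to hold the
covariate set fixed across simulations of a given sample size.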

Now we summarise the properties of the generated sample and 
the value of each parameter. 

The random variable W has the standard extreme value distribution.
It is related to the survival times by the function

\[ Y = \alpha + \frac{W}{p} - \frac{Z_1 \beta_1}{p} - \frac{Z_2 \beta_2}{p} \]

where Y is the log of the survival times. (In the process of
derivation of the above Weibull generating function we have used V in
place of p.)

α = 5, giving λ = 0.00674 in λ = exp(-α)

β₁ values range over (-1, -.5, -.2, -.1, 0, .1, .2, .5, 1)
β₂ values are (0, .1, .2, .5, 1)

p is varied from exp(-.3) to exp(+.3)
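
A minimal Python sketch of such a generator is given below. It
assumes the log-linear form Y = α + (W - Z₁β₁ - Z₂β₂)/p, with W the
logarithm of a unit exponential variate (a standard extreme value
variate), which is the form used later in this chapter with 1/p as
the scale. The function name and default arguments are hypothetical.

```python
import math
import random


def log_survival_time(z1, z2, beta1, beta2, alpha=5.0, p=1.0, rng=random):
    """Draw Y = log(T) from the assumed Weibull log-linear model."""
    e = rng.expovariate(1.0)
    while e == 0.0:          # guard against log(0); practically never occurs
        e = rng.expovariate(1.0)
    w = math.log(e)          # standard extreme value (minimum) variate
    return alpha + (w - z1 * beta1 - z2 * beta2) / p
```

With p = 1 and both coefficients at zero, exp(Y) is exponential with
rate exp(-α), so that α = 5 corresponds to a mean survival time of
exp(5) time units.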

For the censoring patterns we distinguish between accrual times
and survival times. The survival times are generated by the above
function. The accrual times are uniformly distributed between 0 and
50. The levels of censoring are fixed at 0, 5%, 10% and 30%. The
significance levels are set at 0.05, 0.01 and 0.005. The sample sizes
are 25, 50 and 100. We set the null hypothesis to be H₀ : β₂ = 0 and
later H₀ : (β₁, β₂) = (0, 0), and vary the values of β₁ and β₂ in the
region that we mentioned.

At the low end of the magnitudes of |β₁| and |β₂|, where
β₁ = β₂ = 0, the power represents the type I errors. We repeat the
simulations for the one sided alternative hypothesis, which is the
necessary condition of some clinical trials. The above range of
sample sizes and censoring levels forms the complete simulated sets.
However, if the results are at times very close to each other we will
only comment on their overlap.
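
For orientation, the parameter ranges listed above can be enumerated
as a grid. The sketch below assumes a full crossing of all factors,
which the text does not state explicitly; the variable names are
hypothetical.

```python
from itertools import product

# Values as listed in this section.
beta1_values = [-1, -.5, -.2, -.1, 0, .1, .2, .5, 1]
beta2_values = [0, .1, .2, .5, 1]
censoring_levels = [0.0, 0.05, 0.10, 0.30]
significance_levels = [0.05, 0.01, 0.005]
sample_sizes = [25, 50, 100]

# One tuple per simulated configuration, assuming a full factorial layout.
grid = list(product(sample_sizes, censoring_levels, significance_levels,
                    beta1_values, beta2_values))
```

Under that assumption `len(grid)` is 3 x 4 x 3 x 9 x 5 = 1620
configurations.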

5.6 Discussion of the simulation results

In the first instance we refer to Figures (5.6.1) to (5.6.6), in
which a representation of the power efficiency is given for the null
hypothesis H₀ : β₂ = 0. Clearly from the figures we obtain almost
parallel lines for the range of the various covariate values. In fact
the main determinant of the efficiency is the magnitude of the β₂
values. At β₂ values equal to zero we obtain a representation of the
type I error in all cases, and this value is consistent over a range
of factors. The most striking feature of the results for the above
null hypothesis is the consistency of the power values of Cox's test
regardless of the value of the β₁ covariates. Clearly, as we may
expect, the efficiency of the tests does deviate to some extent
according to the sample size, censoring and the significance level;
however, none of these factors seems to affect the lack of influence
of covariates on the power of treatment effect tests. This finding is
clearly in contrast with a view expressed by C.L. Chastang (1983),
where it is reported that the efficiency of the treatment effect is
dependent on the value of prognostic effects, even when it is not
included in the model. We will return to this hypothesis of treatment
effect later in this chapter when we consider alternative parametric
models in the study of proportional and non-proportional hazard
distributed samples. Now we will consider the results of the tests of
the simple treatment effect hypothesis in more detail.



At the value of β₂ = .5, with n = 25 and α = 0.05, we have a
separation in censoring levels of almost 7% in power over the range
of β₁ values (Figure 5.6.1). An increase in the sample size to 50
diminishes the separation of the 0% censoring and 30% censoring
(Figure 5.6.2). At ^ = we have a difference of 3% for the range 

of β₁ values. At β₂ = .5 we have an almost consistent separation of
5% over the range of β₁ values.

A point to note is that the decline in the power of tests due to
censoring seems to be affected by sample size to some extent. At the
higher sample size of 100 the separation between the 0% censoring and
30% censoring at the value of β₂ = .5 is almost 4% (Figure 5.6.3),
while the same separation at n = 25 is 7% (Figure 5.6.1). When we
consider a lower significance level of α = 0.005, the separation
between censoring levels declines, so that at n = 100 and β₂ = .5 the
0% censoring and 30% censoring have a maximum separation of 3%,
Figure (5.6.6). The same separation in power for n = 25, Figure
(5.6.4), is 6%.

Up until now we have dealt with simple tests of hypothesis; now we
discuss a set of power curves for the composite test. The following
simulations have a slight change of emphasis. The previous
simulations assessed the power of tests for a practical assessment of
the β₁, β₂ values in a trial. What follows is presented for
theoretical interest and completeness. Tolley (1978) discusses a
group of non-parametric tests in survival analysis where a composite
test of hypothesis may be of interest. Such tests have certain
computational advantages when dealing with a large data set and a
stepwise variable selection is adopted. The results of Tolley imply
that the large sample distribution of a composite test is chi-squared
with q degrees of freedom. The value of the test statistic is then
given by:

\[ Q_q = Q_{(r)} - Q_{(r-q)} \]

where there are (r) concomitant variables in the fuller model and
(r-q) in the simpler model. In a more complex hypothesis with
H₀ : Cβ = 0, where C is a (q x r) contrast matrix, the value of Q is
then:

\[ Q_q = U'(0)\, I^{-1}(0)\, C' \left[ C\, I^{-1}(0)\, C' \right]^{-1} C\, I^{-1}(0)\, U(0) \]

where, as in the notation of section 4.4 on Cox's method, we consider
U(0) as asymptotically normal with zero mean vector and covariance
matrix I(0), and the test statistic:

\[ U'(0) \left[ I^{-1}(0) \right] U(0) \]
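
To make the two statistics concrete, the following Python sketch
evaluates the overall score statistic U'(0) I⁻¹(0) U(0) and, for a
single contrast row C (q = 1), the contrast form, which in that case
reduces to (C I⁻¹ U)² / (C I⁻¹ C'). It is restricted to two
covariates so that the matrix inverse can be written out explicitly.
The function names and the numerical inputs in the usage note are
hypothetical and do not come from the simulations.

```python
def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists or tuples."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("information matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def score_statistic(u, info):
    """Q = U' I^{-1} U for a two-parameter model."""
    inv = inverse_2x2(info)
    v = [inv[0][0] * u[0] + inv[0][1] * u[1],
         inv[1][0] * u[0] + inv[1][1] * u[1]]
    return u[0] * v[0] + u[1] * v[1]


def contrast_statistic(u, info, c):
    """Q_q for one contrast row c: (c I^{-1} U)^2 / (c I^{-1} c')."""
    inv = inverse_2x2(info)
    v = [inv[0][0] * u[0] + inv[0][1] * u[1],
         inv[1][0] * u[0] + inv[1][1] * u[1]]
    numerator = c[0] * v[0] + c[1] * v[1]
    c_inv = [c[0] * inv[0][0] + c[1] * inv[1][0],
             c[0] * inv[0][1] + c[1] * inv[1][1]]
    denominator = c_inv[0] * c[0] + c_inv[1] * c[1]
    return numerator ** 2 / denominator
```

For example, with a score vector (2, 4) and a diagonal information
matrix with entries 2 and 4, `score_statistic` gives 4/2 + 16/4 = 6,
and `contrast_statistic` with c = (0, 1), which tests the second
coefficient alone, gives 16/4 = 4.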

We will next present the results and show that Cox's method has
good, predictable small sample properties where the two covariates
are independent. In the simulation we again use a 50:50 allocation of
Z₂ values and a 40:60 allocation of Z₁ values, where Z₁ is taken to
be a prognostic effect. One form of trial that has been used to some
extent recently is based on 2 x 2 factorial designs. Such designs by
randomisation will allocate a 50:50 ratio to both the Z₁ and Z₂
indicators. Although the ratios of the simulations in our results are
different from the requirements of a 2 x 2 trial, the good properties
of the composite test efficiency may be attributed to the suitability
of Cox's method when used for simple tests of 2 x 2 trials. Thus for
more conclusive results in this respect, simple tests based on a
symmetric 50:50 proportion of the binary variables Z₁ and Z₂ are
needed.



In the comparisons of the small sample properties that follow we
will use a few terms that need an explanation. The term maximum
deviation is used when two power curves that are compared have a
similar pattern, and thus we only report the maximum deviation
between the two graphs, since it is the most suitable descriptor.
Relative efficiency is used when two single simulations are compared
and is the difference between them. Finally we use the term balanced
in situations where changes of censoring or sample size do not affect
relative efficiency to a major degree.
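
The two descriptors can be written down directly. In the sketch
below a power curve is assumed to be a list of power values evaluated
at the same coefficient values for both curves; the function names
are hypothetical.

```python
def maximum_deviation(curve_a, curve_b):
    """Largest absolute gap between two power curves on a common grid."""
    return max(abs(a - b) for a, b in zip(curve_a, curve_b))


def relative_efficiency(power_a, power_b):
    """Difference in power between two single simulations."""
    return power_a - power_b
```

For two illustrative (made up) curves such as [0.05, 0.30, 0.80] and
[0.05, 0.25, 0.78], the maximum deviation is the gap of about 0.05 at
the middle point.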

Initially we concentrate on the situation of β₁ > 0 and β₂ > 0
with varying levels of significance of type I error, censoring levels
and sample size. Later we will look at the situation of β₁ < 0 and
finally at non-proportional hazards. In the final part we consider
both increasing and decreasing non-proportionality with positive and
negative values of β₁. When we discuss the efficiency or power of
tests it must be noted that in the interesting situations we are
dealing with tests that have power less than the full efficiency
of 1. In the figures of power representation a pattern of converging
power curves often appears. Up to a particular value of the (β₁, β₂)
set the tests deviate from the entity they are estimating. However,
at higher values of (β₁, β₂) or high sample sizes, the variability
due to factors of interest like censoring and significance levels is
not apparent, due to the dominance of the covariate effects. Thus the
efficiency values converge towards the maximum full efficiency of
1.00.

We first consider the type I error, which relates to the number
of times the null hypothesis was rejected when it is true. Figures
(5.6.7) to (5.6.12) contain such information for the proportional
hazard cases. The generated samples conform to the α level
probability limit of the type I error. Asymptotic properties of the
type I error are best summarised in the β₁ = β₂ = 0 case. At the
value of β₁ = 0 and β₂ = 0 we have a balanced configuration of the
power curves, in that by varying the censoring and significance
level the power variability is small or nil for sample sizes of 25,
50 and 100.



We now consider simulations where β₁ = 0. From no censoring to
30% censoring, with the sample size of 25 and the 5% significance
level, over the range of β₂ values the maximum deviation is a 5%
loss. These results in fact complement the earlier results on the
simple tests. Once again, at a higher sample size or at the
significance level of α = 0.005 the differences diminish.

For the above configuration the likelihood ratio test and the
asymptotic normality test give reasonably close efficiency values.
Figure (5.6.13) presents these ratios for α = 0.005 at the sample
size of 25, at β₁ = 0 and a range of β₂ values. The maximum deviation
between the two tests is 1% and it is at 30% censoring. (This is the
only figure presented, since other sample sizes and α values do not
produce figures different from the general pattern.)

Up until now we have been considering β₁ = 0 values. We now
consider changing the values of β₁ to .1, .2, .5 and 1.0 and repeat
for each corresponding value of β₂. Clearly, for β₂* > β₁*, there is
a slightly higher small sample power at (β₁*, β₂*) than at
(β₂*, β₁*), Figures (5.6.7) to (5.6.13). This represents a slight
lack of symmetry for the Z₁ and Z₂ ratios, thus resulting in a higher
relative power for β₂. This imbalance is most noticeable at the
lower sample sizes and decreases with increasing sample sizes at
50 and 100. In the same figures we have also represented the
different censoring values. As may be expected, the small sample
power decreases with increasing censoring levels, although again by
an increase in the sample size the effect of censoring is minimised.
In general a decrease in the significance level also produces a
reduction in power. Now with reference to the above figures we
consider the magnitude of relative efficiency for β₁ and β₂ values.
In general we note that by an increase in the β₂ values the
efficiency increases for fixed values of β₁. On grounds of relative
efficiency for a sample size of 25 with censoring present, we note
that as β₂ increases from 0 the relative efficiency between no
censoring and 30% censoring moves from an 11% loss to a stable 4%.
This 4% relative loss of power for censoring is in fact consistently
the same for higher values of β₂, Figure (5.6.7).

For the sample size of 50 a relative loss of 5% occurs
regularly for most values of β₁ and β₂ from 0 to 1, Figure (5.6.8).
The sample size of 100 gives a stable 2% loss of efficiency with 30%
censoring, Figure (5.6.9). Thus, apart from the 30% censoring for the
sample size of 25 with no covariate effects present, the loss of
efficiency is very reasonable, and in the worst cases at the sample
size of 25 a 30% censoring produces a relative loss below 10%. This
10% loss however will be discussed later and is far less for a
balanced effect.

A point that may be made here is that if we consider 10% and 5%
censoring we obtain stable losses of efficiency throughout the
simulation, even at lower sample sizes. At the significance level of
0.005 there is a maximum efficiency loss of 12% at the sample size of
25, Figure (5.6.10). This value does not stabilise and constantly
diminishes, reaching a value of 7% for the higher values of β₂.
However, at the sample size of 50 and the same significance level the
loss in efficiency due to censoring stabilises at 5% for values of
β₂ > .1 and at 8% for β₂ = 0, Figure (5.6.11). For the sample size of
100 the loss in efficiency is a regular 4%. Thus once again
reasonable efficiency losses are produced at 30% censoring, Figure
(5.6.12). The 5% and 10% censoring levels, even with a significance
level of 0.005, constantly produce very low efficiency losses. Once
again we note that although the 16% loss at 30% censoring is not
problematic, some of this value may be attributed to the lack of
balance for the covariate effect.



We can thus summarise that in all sample sizes a reduction in
the value of α reduces the power. In relative terms, however, the
increase in power of the test due to an increase in the sample size
is greater at low sample sizes and at low type I error levels. Again
in relative terms, the higher censoring effects appear with lower
sample sizes and low values of α. The relative differences due to
significance level and censoring effects are thus minimised for
sample size 100, within the range of our simulations. We also note
that the improvement in the power of the tests from sample size 25 to
50 is greater than the improvement from sample size 50 to 100.

As we pointed out earlier, an imbalance has been introduced
into the covariate effects. Thus the power of the tests is slightly
different for (β₁*, β₂*) and (β₂*, β₁*) values. In other words a
(β₁, β₂) value referring to a particular covariate and treatment
effect does not refer to a set within which the covariate and
treatment effects have been exchanged. The same condition applies to
the varying censoring levels and sample sizes. In the higher
censorings and low sample sizes the effect of the difference in the
comparable magnitudes of β₁ and β₂ appears more substantial. At the
sample sizes of 50 and 100 the relative effects of censoring diminish
substantially and the resultant loss of power from 0% to 30%
censoring remains the same for (β₁*, β₂*) and (β₂*, β₁*). The lack
of symmetry
between the covariates has a resultant power difference of 20% for
the worst case of the sample size 25 at the significance level of 5%.
This value diminishes for the sample size of 50 to 10% at the worst
case, and at 100 to about 7%, for the significance level of 5%. The
relative difference in loss of efficiency for the 10% censoring is
10% for (β₁, β₂) = (.5, 0) and 6% for (β₁, β₂) = (0, .5) as the most
extreme case for the loss in efficiency. At the sample size of 50 for
the same significance level, censoring and (β₁, β₂) values we note 5%
and 3% relative losses in efficiency.

At the significance level of 0.005 for the worst case at the 
25 sample size and 30% censoring, we obtain a 12% loss in efficiency 
for the lack of balance in the worst case. Although at the extreme 
worst case the relative loss is the same for the two significance levels 
of 0.05 and 0.005, at the latter value the results are more regular 
and towards a higher range of magnitude. At the 50 sample size the 
results of 0.005 significance level is similar to the 0.05 level
both in terms of the magnitude of the worst case and the regularity of 
the losses. At the 100 sample size we note a similar pattern as 
above in relation to loss in efficiency due to the 5% and 0.5% 
significance levels. 

For the relative loss in efficiency of the above discussion, 
the worst case is the relative loss due to the significance level at 
0.005 and 30% censoring giving an 11% loss at 25 sample size. This
value is reduced to 5% and 4% losses for the samples of 50 and 100 
respectively. 

Referring back to the 20% imbalance of the covariate effect Z₁
and the fully balanced treatment effect Z₂, in relative terms at the
30% censoring level the loss in efficiency is more serious for the
unbalanced variable than for the balanced variable. In fact the 11%
difference at the sample size of 25 diminishes to a reasonable 4%
loss, which is a stable loss for values of the significance limit at
0.05 and 0.005. Although we have considered the only reason for the
power differences of the above type to be the lack of balance in Z₁
under small sample properties, other investigations (Kay, 1979) have
presented asymptotic results which are compatible with our findings.

In this study we also consider the effect of the type of test
used, that is, the efficiency of the asymptotic normality test
against the asymptotic likelihood test. The difference between the
tests again diminishes for the higher sample sizes and is most
pronounced for the sample size of 25. Almost consistently the
asymptotic normality test turns out to be the more conservative test,
and this is true in particular as β₁ and β₂ deviate from zero.
Censoring does not produce a major difference in the relative
difference of the asymptotic likelihood and normality tests. In
magnitude the maximum difference is 6%, for 30% censoring, sample
size 25 and significance level of 0.005. All other relative
variability of power between the two tests is less than this value.
We will return to a comparison of the two tests under conditions
where the proportionality of hazards assumption is not valid, later
in this section.

Up until now we have dealt with the generation of exponential
samples in our simulations. Most of the results so far are in fact
very closely in line with what may be expected of such simulations.
Next we will consider the generation of Weibull type distributions.
Initially we generate the samples in such a way that the assumption
of the proportionality of hazards is not violated. Later we will
generate samples in which non-proportionality of the hazards is
present. As shown before, we will use p to control the shape of the
hazard rates and to produce non-constant Weibull type hazard rates.
Using

\[ Y = \alpha + \sigma W - Z_1 \beta_1 \sigma - Z_2 \beta_2 \sigma \]

the value of σ = p⁻¹ has been fixed at p = 1 so far. Therefore all
hazard rates have been constant for all subgroups. Now we will vary
the value of p and proceed with the generation of samples of varying
sizes that produce proportional hazards of Weibull type with
increasing or decreasing hazard rates. By the definition of
proportional hazards, such effects should play a nominal role in the
estimation of the β's. This is true mainly by definition, in that the
β's are estimated in terms of relative effects on subgroups. It is
however known that in the estimation by partial likelihood certain
assumptions have been introduced in the treatment of ties and of the
effect of censoring. The results for increasing hazards (p = 1.5) and
decreasing hazards (p = 0.5) are identical when there is no
censoring. However, with 30% censoring there was a slight deviation
of 1 to 2 samples in 300 generations, which is nominal. At p = 0.5,
that is decreasing hazards with a rather high initial failure rate,
we may notice a larger number of failure times at zero; therefore the
chance of producing tied observations at the beginning of the
survival times is higher and again there is a lower efficiency for
these groups. Altogether, all β sets that were tried produced very
close efficiency values, differing by the order of 3 in 300
generations in the extreme worst cases. Due to the close similarity
of these results we will not produce any graphical presentations.
However, we study two sample generators, one at p = 0.5 (decreasing
hazard) and one at p = 1.5 (increasing hazard).

Fixing a at 0.0001, β₁ = .2, β₂ = .1, with no censoring and n = 25,
we obtain the following estimators:

                     p = 1.5      p = 0.5

β̂₁                   .233         .232
β̂₂                   .139         .138
Var(β̂₁)              .043         .043
Var(β̂₂)              .032         .032
Lik(β̂₁, β̂₂)        -16.85       -16.92
Lik(β̂₁, 0)          -18.53       -18.59
Lik(0, β̂₂)          -17.47       -17.51

The above results clearly indicate very similar estimates for the
values of β. This close resemblance is mainly due to the
non-parametric nature of the method. An interesting question however
is related to the study of the effects of the covariates when the
actual regression coefficients are time dependent. This effect can
best be generated by allowing different subgroups of the patients to
have different hazard rates.
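
As a small worked example of the two kinds of test compared in this
chapter, the figures in the p = 1.5 column of the table above can be
used to form a likelihood ratio statistic and an asymptotic normality
(Wald type) statistic for the first coefficient. The sketch assumes
that "Lik" denotes a log likelihood, which the table does not state;
the variable names are hypothetical.

```python
# Values copied from the p = 1.5 column of the table above.
lik_full = -16.85         # Lik(beta1, beta2), both coefficients fitted
lik_beta1_zero = -17.47   # Lik(0, beta2), first coefficient set to zero
beta1_hat = 0.233
var_beta1 = 0.043

# Likelihood ratio statistic for H0: beta1 = 0 (assumes log likelihoods).
lr_statistic = 2 * (lik_full - lik_beta1_zero)

# Squared standardised estimate for the same hypothesis.
wald_statistic = beta1_hat ** 2 / var_beta1
```

Both statistics are referred to a chi-squared distribution with one
degree of freedom; here they come out at about 1.24 and 1.26, so the
two tests are in close agreement for this sample.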

In the study of the effects of non-proportionality of the hazards
we continue with two-covariate generating models. The effect of
non-proportionality can thus be more complicated, in that it affects
both β₁ and β₂ simultaneously. We use the same model as before;
however, the value of p is dependent on the value of Z. Hence

    p ≠ 1   if Z₁ or Z₂ = +1
    p = 1   if Z₁ and Z₂ = -1
The value of p = 1 for all samples reduces to exponential
decomposition of the hazard rates. The value of p ≠ 1 is however
important in that it indicates a measure of deviation from
proportionality.
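
The subgroup dependent shape can be sketched as below. The reading
that the shape differs from 1 whenever either covariate is at its
high level, and the parameterisation through the logarithm of the
shape, are assumptions based on the surrounding description; the
function name is hypothetical.

```python
import math


def shape_for_patient(z1, z2, log_shape):
    """Weibull shape p for one patient under the assumed scheme.

    Patients with z1 = +1 or z2 = +1 get p = exp(log_shape); patients
    with both covariates at -1 keep the exponential case p = 1.
    """
    if z1 == 1 or z2 == 1:
        return math.exp(log_shape)
    return 1.0
```

With `log_shape = 0` every patient has p = 1 and the samples are
exponential with proportional hazards; moving `log_shape` away from
zero in either direction increases the departure from
proportionality.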

We repeat the simulation for similar ranges of β₁ and β₂ using
the same hypothesis with the same sample sizes. This time the values
of the asymptotic normality and the asymptotic likelihood tests are
of special interest.

In the above we have assumed that time dependency is acting
equally on the high levels of Z₁ and Z₂. This need not be the case in
a more restrictive simulation study. One may allow time dependency to
be an effect of one of the covariates. A usual manner of analysis is
to stratify the data into early and late effects, and thus one
produces two baseline hazards for the population. In terms of a
population with one time dependent effect, the two strata should
produce contours of the type in Figure (5.6.14) in the presence of
normality of their β estimates.

St1 and St2 refer to the two strata for early and late events
when both β₁ and β₂ are greater than zero. Although we have presented
one figure with two different contour sets, it may be considered as
two different figures, one for each stratum, superimposed. For the
cases of two time dependent covariates, or a situation where time
dependency is latent within the population, the contour generated by
our model may be represented as in Figure (5.6.15). We have a
continuous time dependent effect influencing covariates in both the
β₁ and β₂ directions.

For reasons of dimensional symmetry we use a transformation of p
to p* = log(p), so that the negative values of p* refer to decreasing
hazards. We therefore use values of p* set to -.3, -.2, -.1, 0, .1,
.2, .3. At first we consider the effect of p* on β₁ and β₂, Figures
(5.6.16) to (5.6.21).

Consistently we see a reduction in the type I error to less than
the actual significance level α. In relative terms the power
decreases with increasing deviation from the proportionality of
hazards. Using Figures (5.6.16) and (5.6.17) we note that the
reduction in type I error is to the reasonable numerical low level of
4%, compared to the nominal 5% significance level, for the maximum
deviation from proportionality at p* = .3 and p* = -.3. Further it is
noted that at β₁ = 0 and β₂ = 0 there is not a reduction in the
relative loss of power (in terms of the proportional to the
non-proportional hazards) with an increase of the sample size,
Figures (5.6.18) and (5.6.20). This indicates that the loss is due to
the systematic effect of time dependency. For the lower values of
increasing and decreasing non-proportionality at p* = -.2, -.1, .1
and .2, the reduction in efficiency is also within a range of 4%. In
comparison of the relative loss of power for the corresponding
magnitudes of the increasing and decreasing hazards, once again we
note a reduction in type I error of the increasing
non-proportionality compared to those of decreasing
non-proportionality rates. This effect is at 2% for the maximum
difference of p* = .3 and p* = -.3.
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Once again we note the asymptotic likelihood and normality 
3t ^2} = ^° °^ * which indicates a measure of type I errors. The 

relative loss of asymptotic normality to asymptotic likelihood is 2% 
at the sample size of 25 with the maximum reduction du e to non-proportion- 
ality at P* = -.3, Figure (5.6.23). For the sample size of 50, 
Figure (5.6.24) this relative deviation of type I error reduces to 
just under 1%, which is at a similar level to the relative difference 
of type I error for the proportional hazard rates. We thus conclude 
that the relative difference in type I error in situations of non- 
proportionality of hazards at the sample size of 25 is at a low 
value of approximately 2% and the relative loss reduces to thote of the 
proportional hazard situation as the sample size increases to 50. 
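
A rough illustration of how such type I error rates can be estimated by simulation is sketched below. It is not the Cox partial likelihood computation used in the thesis: as a simplifying assumption it compares two uncensored exponential samples, for which the test based on asymptotic normality (Wald) and the likelihood ratio test both have closed forms. The group sizes, seed and number of replications are arbitrary choices.

```python
import math
import random

CHI2_1DF_5PCT = 3.841  # upper 5% point of chi-square with 1 degree of freedom


def wald_and_lr(times0, times1):
    """Wald and likelihood-ratio statistics for H0: equal exponential hazards."""
    n0, n1 = len(times0), len(times1)
    t0, t1 = sum(times0), sum(times1)
    # With no censoring each hazard MLE is (number of events) / (total time).
    beta_hat = math.log(n1 / t1) - math.log(n0 / t0)
    # Approximate variance of the log hazard ratio: 1/n0 + 1/n1.
    wald = beta_hat ** 2 / (1.0 / n0 + 1.0 / n1)
    # Twice the gain in maximised log-likelihood from fitting separate hazards.
    lr = 2.0 * (n0 * math.log(n0 / t0) + n1 * math.log(n1 / t1)
                - (n0 + n1) * math.log((n0 + n1) / (t0 + t1)))
    return wald, lr


def type_one_error(n, reps=2000, seed=1):
    """Empirical rejection rates of both tests when there is no group effect."""
    rng = random.Random(seed)
    n0 = n // 2
    n1 = n - n0
    reject_wald = 0
    reject_lr = 0
    for _ in range(reps):
        times0 = [rng.expovariate(1.0) for _ in range(n0)]
        times1 = [rng.expovariate(1.0) for _ in range(n1)]
        wald, lr = wald_and_lr(times0, times1)
        reject_wald += wald > CHI2_1DF_5PCT
        reject_lr += lr > CHI2_1DF_5PCT
    return reject_wald / reps, reject_lr / reps


for n in (25, 50):
    print(n, type_one_error(n))
```

Both rejection rates should lie near the nominal 5%, and comparing them at n = 25 and n = 50 mirrors, in a much simpler setting, the comparison of the two asymptotic tests made above.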

We continue with the simulations by setting the β2 value for the
treatment effect to zero and letting the β1 covariate effect vary over
the values .1, .2, .5 and 1.0. We note that the non-proportionality
of hazards at the sample size of 25, with the significance level at
5%, produces a maximum loss of power of 25% at the decreasing
non-proportionality rate of P* = -.3. This loss is at a reasonable 5%
level for the decreasing rate of P* = -.1. An increase of the sample
size to 50 reduces this relative loss for the maximum decreasing rate to
15%. At the sample size of 100 the relative loss of power at P* = -.3
reduces to a reasonable value of less than 6%, and this value is
stable for β1 values of .1, .2 and .5. We thus state
that decreasing hazards do lead to a loss of power, and the magnitude of
this loss at β2 = 0 is dependent to some extent on the values of β1.
This dependence of the non-proportionality effect on β1
reduces to nominal levels at sample sizes of 50 and 100. For the sample
size of 25, however, with non-proportionality at a decreasing rate of
P* = -.3 and the covariate effect at a high value of 1, the loss in
efficiency is unacceptable.

In terms of increasing hazards we again obtain a loss of
power. However there is less relative loss of
power compared with decreasing deviations from proportionality (negative
values of P*). In considering the various sample sizes the
same conclusions may once again be drawn. In fact the increasing
hazard rate at P* = .3 produces very stable values of relative loss of
power, at less than 6% for the sample size of 50 and lower values for the
sample size of 100. At the sample size of 25, the maximum variability
due to increasing non-proportionality occurs at P* = .3
with β1 at its maximum of 1.0. The corresponding effect for
decreasing rates was noted to be 25% and judged unacceptable, but now it
is reduced to 12%. Although sample sizes of 50 and 100
give reasonable values of power loss due to non-proportionality of
both decreasing and increasing types, extreme caution is needed for a
sample size of 25 in the presence of decreasing non-proportionality
rates and higher values of the covariate effect β1.

Up until now we have considered the effect of β values
on power for the different sample sizes; we now continue with a few
words on the relative gain in power from increasing the sample size.
An increase in sample size from 25 to 50 gives a maximum gain of 14% in
relative efficiency for the proportional hazard rates. A change in sample
size from 50 to 100 gives an increase of less than 8% in relative power
for the proportional hazard rates. Clearly the increase in sample size
plays a more significant role for the non-proportional hazard rates.
At the maximum non-proportionality of P* = -.3 an increase of 27% is
obtained with the covariate value β1 set to 1. This improvement is
important in the sense that at higher values of β1 the loss of power
due to non-proportionality reduces to an acceptable level. For an
increase in sample size from 50 to 100, with a similar generation of β1
and β2 values, we obtain a gain of 13% in power. However this is
achieved at a point at which the sample size of 100 with P* = -.3 and
β1 = 1 has the full maximum efficiency of 1.0.

In the study of non-proportionality we observe that
P = exp(-.3) produces the best representation of the results for β1
and β2 values, and thus we continue with this sample for values of
β2 > 0 and β1 < 0, using an analogous one sided alternative
hypothesis, Figures (5.6.22) to (5.6.25).



First we deal with the maximum likelihood estimator with constant
hazard rates. At the constant hazard rate the relative efficiency is
clearly symmetric about β1 = 0, Figure (5.6.22). Figure (5.6.23)
presents a decreasing hazard rate, and there is clearly a lack of
symmetry about β1 = 0. As we showed earlier, values of P < 1
produced a larger level of variability than P > 1 hazard rates, and
now we note that there is also a lack of symmetry about β1 = 0. The
following figure can describe what is happening in terms of converging
or diverging forms of the proportionality of the hazards.

The above figure presents increasing, decreasing and constant Weibull
hazard rates together with positive and negative β values. In the
discussion of figures (5.6.16) to (5.6.21) we presented results in which,
for positive values of β, the P > 1 simulations were more stable and
efficient than those with P < 1. Using the above figure we note that
for β > 0, P > 1 implies diverging hazards, while P < 1 implies
converging hazards. In the description of figures (5.6.22) and (5.6.23)
the above figure can again help, in that for P < 1 we note that β < 0
(diverging hazard) compared to β > 0 (converging hazard) produces high
efficiency. That is, in either case, as may be expected, divergence from
the baseline hazard produces higher efficiencies.
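
One reading that is consistent with the statement for β > 0 is sketched below. It assumes a Weibull hazard of the form scale * P * t**(P - 1) * exp(β z) and takes "diverging" to mean that the absolute gap between the z = 1 and z = 0 hazards grows with time. Both the functional form and this definition are assumptions made for illustration; the original figure is not reproduced here, and the sketch addresses only the β > 0 case.

```python
import math


def weibull_hazard(t, shape, beta=0.0, z=0, scale=1.0):
    """Hazard scale * shape * t**(shape - 1) * exp(beta * z) at time t > 0."""
    return scale * shape * t ** (shape - 1) * math.exp(beta * z)


def hazard_gap(t, shape, beta):
    """Absolute difference between the z = 1 and z = 0 hazards at time t."""
    high = weibull_hazard(t, shape, beta, z=1)
    low = weibull_hazard(t, shape, beta, z=0)
    return abs(high - low)


def trend(shape, beta, t_early=0.5, t_late=2.0):
    """Compare the hazard gap at an early and a late time point."""
    early = hazard_gap(t_early, shape, beta)
    late = hazard_gap(t_late, shape, beta)
    if math.isclose(early, late):
        return "parallel"
    return "diverging" if late > early else "converging"


for p_star in (-0.3, 0.0, 0.3):
    print(f"P* = {p_star:+.1f}, beta = 0.5: {trend(math.exp(p_star), 0.5)}")
```

Under these assumptions the gap widens for P > 1, narrows for P < 1 and stays constant for P = 1, in line with the description given for positive β.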



We once again observe that, due to the imbalance of 20% for
the covariate effect, corresponding values of power efficiency at
(β1, β2) and (β2, β1) deviate slightly from each other. For the various
values of treatment effect β2 and covariate effect β1 we note that there
are important losses of power, with magnitudes as high as 28% for n = 25.



In the previous discussions of non-proportionality with
β2 = 0, a high efficiency loss in fact presented stable levels of
efficiency, and often stable levels of efficiency over the ranges of
β1 and β2 corresponded with low efficiency levels. Here, as we pointed
out in the discussion of non-proportionality at the sample size of 25
and β2 = 0, extreme caution is needed if such a design is ever used in
practice. However there is a pattern emerging from the sample size of 25
simulations, which represents the relative efficiency in terms of the
variability of β1, β2 and P*, Figure (5.6.23).

The worst region in terms of relative loss of power between
figures (5.6.22) and (5.6.23) is due to covariate effect values
of β1 between 0 and .2 for β2 values greater than 0.2, and β1
values between 0 and 1 for β2 values less than .2. The
former group shows an efficiency loss of 15% and the latter group
an efficiency loss of 28%.



For the negative values of β1 we observe a pattern
emerging, indicating that for values of β1 < -.2 the relative loss in
efficiency in comparison to the proportional hazard samples is
decreasing steadily. In fact for β1 values between -.1 and 0, at
each particular level of β2, the difference in efficiency is within
24%, while outside this range of generated β values the
efficiency is closer to the proportional hazard situation.



Thus we summarise that with the sample size of 25 there
is a loss of symmetry, Figure (5.6.23). This relative difference for
β1 < 0 [...] β2 > 0 [...] an increase in sample [...].

It must be noted that β1 > 0, β2 > 0 with P* > 0 is not comparable
with β1 < 0 and β2 > 0 and P* < 0.

We now proceed [...] as those of sample size 25, with the
non-proportionality set at P* = -.3 [...] Figure (5.6.24) [...] when
the sample size is increased. In [...] of β1 and β2 values we also note
doubling [...] than zero we obtain a maximum loss of efficiency of [...]
for this [...] of non-proportionality, if we confine [...] positive
[...]. For the negative values of β1 the loss in [...] is even less when
proportionality does not [...]. So the [...] loss due to
non-proportionality [...] to 7% for the sample size of 100.

So far in the discussions of non-proportionality we have only
mentioned the asymptotic likelihoods. As we pointed out, the results
from asymptotic normality follow a very close pattern when we deal with
a proportional hazard situation. This deviation increases with deviation
from proportionality, however it remains at [...] levels for the
sample sizes of 50 and 100 generations.

We continue with the simulations for the non-proportional
case and use a graphical representation for β2 > 0 and P* = -.3,
repeated for β1 > 0 and β1 < 0, sample sizes 25, 50 and 100, α set to 5%
and no censoring present in the data, figures (5.6.23) to (5.6.25).
There is a slight indication that asymptotic normality behaves in
a more symmetric manner on the two sides of the β1 axis. The
asymptotic likelihood however is consistently less conservative than
the asymptotic normality test. The relative power difference of
the two tests diminishes with increasing sample size. The actual
magnitude of the parameter |β1| clearly plays a role in the power
of the tests. Generally an increase in the value of |β1| reduces
the relative power difference between the asymptotic likelihood and
normality tests. This is partly due to the fact that non-proportionality
variability is reduced by the increase in sample size, and partly due to
the actual covariate effect becoming more dominant and thus producing a
reduction in its variability.

Finally we present tables (5.6.1), (5.6.2) and (5.6.3),
which give the ranges of β1 and β2 values used in the
simulations and the corresponding β1 and β2 estimates with their
variances, under the proportional hazard assumptions. Clearly the β
estimators are very close to the actual β values. There is a
negligible bias present over the range of the simulations for the
given covariate set, which declines with increase in the sample size.
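
The kind of summary reported in these tables (mean estimate and variance over repeated generations) can be sketched as follows. This is only an illustration under simplifying assumptions: one binary covariate, uncensored exponential survival times and the closed-form maximum likelihood estimate of the log hazard ratio, rather than the two-covariate Cox fits of the thesis. The seed and number of replications are arbitrary.

```python
import math
import random
import statistics


def log_hazard_ratio_estimate(rng, n, beta):
    """One simulated estimate of beta from two uncensored exponential groups."""
    n0 = n // 2
    n1 = n - n0
    # Group 0 has hazard 1; group 1 has hazard exp(beta).
    t0 = sum(rng.expovariate(1.0) for _ in range(n0))
    t1 = sum(rng.expovariate(math.exp(beta)) for _ in range(n1))
    # Each hazard MLE is events / total time; beta-hat is the log of their ratio.
    return math.log(n1 / t1) - math.log(n0 / t0)


def bias_and_variance(n, beta, reps=2000, seed=7):
    """Monte Carlo bias and variance of the estimator at sample size n."""
    rng = random.Random(seed)
    estimates = [log_hazard_ratio_estimate(rng, n, beta) for _ in range(reps)]
    return statistics.mean(estimates) - beta, statistics.variance(estimates)


for n in (25, 50, 100):
    bias, var = bias_and_variance(n, beta=0.5)
    print(f"n = {n:3d}   bias = {bias:+.4f}   variance = {var:.4f}")
```

In this simplified setting the variance falls roughly in proportion to 1/n, which is the qualitative behaviour seen across the three tables.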



    β1    β2        β̂1  VAR(β̂1)        β̂2  VAR(β̂2)

    -1     0    -.9946     .077     .0005     .026
   -.5     0    -.4961     .057     .0006     .027
   -.2     0    -.1987     .043      .006     .027
   -.1     0    -.0993     .041     .0007     .027
     0     0     .0004     .031     .0007     .027
    .1     0     .1012     .043     .0007     .027
    .2     0     .2019     .047     .0007     .027
    .5     0     .5051     .059     .0006     .027
     1     0    1.0073     .079     .0005     .027

    -1    .1    -.9948     .075     .1008     .032
   -.5    .1    -.4962     .056     .1008     .033
   -.2    .1    -.1978     .043     .1009     .036
   -.1    .1    -.0993     .041     .1009     .038
     0    .1    -.0004     .030     .1009     .039
    .1    .1     .1011     .042     .1009     .037
    .2    .1     .2017     .046     .1008     .037
    .5    .1     .5049     .057     .1008     .036
     1    .1    1.0070     .078     .1008     .033

    -1    .2    -.9949     .071     .2010     .042
   -.5    .2    -.4965     .054     .2011     .043
   -.2    .2    -.1991     .041     .2011     .043
   -.1    .2    -.0994     .039     .2013     .044
     0    .2    -.0004     .030     .2013     .044
    .1    .2     .1011     .041     .2012     .044
    .2    .2     .2014     .046     .2012     .042
    .5    .2     .5049     .055     .2009     .041
     1    .2    1.0069     .074     .2006     .041

    -1    .5    -.9956     .064     .5032     .050
   -.5    .5    -.4971     .051     .5035     .051
   -.2    .5    -.1992     .040     .5038     .051
   -.1    .5    -.0996     .037     .5038     .052
     0    .5     .0003     .029     .5039     .052
    .1    .5     .1010     .037     .5037     .051
    .2    .5     .2011     .043     .5034     .050
    .5    .5     .5045     .052     .5031     .049
     1    .5    1.0063     .069     .5027     .047

    -1     1    -.9962     .054    1.0060     .050
   -.5     1    -.4978     .047    1.0066     .058
   -.2     1    -.1994     .038    1.0068     .064
   -.1     1    -.0997     .033    1.0069     .067
     0     1    -.0002     .029    1.0070     .069
    .1     1     .1008     .035    1.0069     .066
    .2     1     .2001     .041    1.0067     .062
    .5     1     .5043     .050    1.0065     .059
     1     1    1.0059     .057    1.0059     .045

Table (5.6.1)  n = 25, no censoring, P* = 0



    β1    β2        β̂1  VAR(β̂1)        β̂2  VAR(β̂2)

    -1     0    -.9975     .033     .0003     .015
   -.5     0    -.4981     .025     .0003     .016
   -.2     0    -.1989     .021     .0002     .016
   -.1     0    -.0998     .019     .0003     .016
     0     0    -.0003     .015     .0003     .016
    .1     0     .1007     .019     .0002     .016
    .2     0     .2012     .021     .0003     .016
    .5     0      .526     .027     .0002     .016
     1     0    1.0039     .034     .0002     .016

    -1    .1    -.9975     .033     .1006     .017
   -.5    .1    -.4982     .025     .1004     .018
   -.2    .1    -.1990     .020     .1006     .021
   -.1    .1    -.0997     .019     .1003     .021
     0    .1    -.0003     .014     .1002     .021
    .1    .1     .1006     .019     .1003     .020
    .2    .1     .2011     .021     .1004     .020
    .5    .1     .5025     .026     .1004     .000
     1    .1    1.0038     .034      .003     .048

    -1    .2    -.9979     .031     .2008     .022
   -.5    .2    -.4983     .024     .2009     .023
   -.2    .2    -.1993     .019     .2009     .023
   -.1    .2    -.0999     .018     .2009     .023
     0    .2    -.0002     .015     .2008     .023
    .1    .2     .1005     .019     .2008     .023
    .2    .2     .2010     .021     .2007     .022
    .5    .2     .5024     .024     .2007     .022
     1    .2    1.0037     .033     .2006     .021

    -1    .5    -.9978     .027     .5019     .077
   -.5    .5    -.4982     .023     .5017     .026
   -.2    .5    -.1992     .019     .5017     .026
   -.1    .5    -.0998     .018     .5015     .026
     0    .5    -.0002     .014     .5016     .025
    .1    .5     .1005     .018     .5017     .025
    .2    .5     .2011     .020     .5017     .026
    .5    .5     .5025     .024     .5017     .027
     1    .5    1.0037     .029     .5018     .027

    -1     1    -.9980     .023    1.0029     .024
   -.5     1    -.4983     .021    1.0029     .027
   -.2     1    -.1993     .018    1.0030     .030
   -.1     1    -.0999     .016    1.0032     .032
     0     1    -.0001     .013    1.0032     .033
    .1     1     .1005     .017    1.0031     .031
    .2     1     .2010     .019    1.0029     .030
    .5     1     .5024     .023    1.0028     .028
     1     1    1.0037     .025    1.0027     .025

Table (5.6.2)  n = 50, no censoring, P* = 0



    β1    β2        β̂1  VAR(β̂1)        β̂2  VAR(β̂2)

    -1     0    -.9989     .018     .0001     .011
   -.5     0    -.4990     .015     .0001     .011
   -.2     0    -.1996     .013     .0001     .012
   -.1     0    -.1001     .012     .0001     .012
     0     0     .0000     .011     .0000     .012
    .1     0     .1005     .011     .0001     .012
    .2     0     .2008     .012     .0001     .012
    .5     0     .5017     .016     .0001     .012
     1     0    1.0019     .019     .0002     .011

    -1    .1    -.9989     .018     .1004     .012
   -.5    .1    -.4991     .015     .1004     .013
   -.2    .1    -.1996     .012     .1004     .015
   -.1    .1    -.1001     .012     .1005     .015
     0    .1     .0001     .011     .1005     .014
    .1    .1     .1005     .012     .1004     .014
    .2    .1     .2008     .012     .1004     .014
    .5    .1     .5017     .015     .1004     .013
     1    .1    1.0020     .019     .1004     .013

    -1    .2    -.9990     .017     .2008     .015
   -.5    .2    -.4991     .014     .2008     .015
   -.2    .2    -.1997     .012     .2007     .015
   -.1    .2    -.1001     .012     .2007     .015
     0    .2     .0001     .011     .2007     .015
    .1    .2     .1005     .013     .2007     .015
    .2    .2     .2007     .013     .2008     .015
    .5    .2     .5016     .014     .2008     .015
     1    .2    1.0021      .08     .2009     .015

    -1    .5    -.9990     .015     .5015     .018
   -.5    .5    -.4992     .013     .5014     .017
   -.2    .5    -.1998     .011     .5014     .017
   -.1    .5    -.1001     .011     .5013     .016
     0    .5     .0001     .010     .5012     .016
    .1    .5     .1004     .011     .5013     .016
    .2    .5     .2007     .012     .5014     .017
    .5    .5     .5017     .015     .5014     .018
     1    .5    1.0023     .016     .5015     .018

    -1     1    -.9991     .013    1.0021     .014
   -.5     1    -.4995     .012    1.0020     .016
   -.2     1    -.2000     .011    1.0019     .019
   -.1     1    -.1001     .010    1.0017     .019
     0     1     .0000     .010    1.0016     .019
    .1     1     .1004     .011    1.0017     .018
    .2     1     .2007     .012    1.0019     .017
    .5     1     .5015     .013    1.0020     .017
     1     1    1.0023     .015    1.0022     .016

Table (5.6.3)  n = 100, no censoring, P* = 0



Earlier we showed that in the study of Cox's method the exponential
and Weibull generated samples are very similar in terms of testing
the significance of covariates, and that the interesting situation is
in fact that of the effect of non-proportionality of the hazards. Now we
will consider the simple tests of the treatment effect for the various
values of the non-proportional hazard generations. The tests once
again correspond to a similar set of β1 and β2 values, both greater
than zero. In the following generations however we will repeat the
simulations and analyse the generated samples according to different
generalised linear models. As we described in chapter 3, the most
commonly used models in this respect are the Weibull and the exponential
models. We will report the simulations initially for the proportional
hazard generations. In the analysis we will consider (a) the fixed
covariate Cox's method, (b) the time dependent Cox's method, which is more
suitable for the non-proportional situation, (c) the stratified Cox's
method, (d) the Weibull model with the generalised linear model assumption
and finally (e) the exponential model.
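
For orientation, the five analyses can be written in their usual textbook hazard forms. The notation below is an assumption made for illustration: the coefficient γ, the function g(t), the choice of z1 as the covariate carrying the time dependency, and the use of two strata indexed by s are not taken from the thesis, which may parameterise the models differently.

```latex
\begin{align*}
\text{(a) fixed covariate Cox:}\quad
  & h(t \mid z) = h_0(t)\,\exp(\beta_1 z_1 + \beta_2 z_2) \\
\text{(b) time dependent Cox:}\quad
  & h(t \mid z) = h_0(t)\,\exp\{\beta_1 z_1 + \beta_2 z_2 + \gamma\, z_1\, g(t)\} \\
\text{(c) stratified Cox:}\quad
  & h_s(t \mid z) = h_{0s}(t)\,\exp(\beta_1 z_1 + \beta_2 z_2), \qquad s = 1, 2 \\
\text{(d) Weibull:}\quad
  & h(t \mid z) = \lambda P\, t^{P-1}\,\exp(\beta_1 z_1 + \beta_2 z_2) \\
\text{(e) exponential:}\quad
  & h(t \mid z) = \lambda\,\exp(\beta_1 z_1 + \beta_2 z_2)
\end{align*}
```

Written this way, (e) is the special case P = 1 of (d), and (a) leaves the baseline hazard unspecified where (d) and (e) fix its shape.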

The non-proportional generations are all of the Weibull type.
This is an arbitrary choice in so far as deviation of the exponential
decomposition of the relative risk is concerned. For the purpose of
the analysis we deal only with exponential and Weibull parametric
models. These two models are in practice the most relevant for the
decomposition of the relative risks in survival studies. Due to the
introduction of non-proportionality into the generated samples, an
alternative approach based on non-linear models of Weibull type
may be possible. However in this respect the interpretation of the
β estimates [...] non-proportionality mentioned in the parametric
Weibull [...] the Weibull [...].

A further point that requires some attention is related to
the results of the previous study of β2 efficiency in the presence of β1.
We assumed there is no correlation between β1 and β2, and thus there
were no confounding effects present. [...] repeat the same values of β1
and β2 and analyse the samples with the above mentioned models. Once
again there is a distinction, in that absence of confounding between β1
and β2 entails a constant relative power for the estimation of treatment
effects. For the time dependent situation however any loss of power is
essentially attributed to the use of wrong models.



[...] of what follows we take a significance level [...] censoring
situations. Initially we consider [...] figure (5.6.26). At β2 equal to
zero there is agreement between all the models at power levels around
0.05. This value is constant and does not change with increasing values
of β1 from 0 to 1. Next we consider values of β2 greater than zero. We
note that for the proportional hazard generations, consistently, the
parametric Weibull model has higher power than the other models. The
worst model (ignoring the exponential) when the samples have
proportional hazards is Cox's non-proportional hazard model. The other
models, Cox's proportional hazards and the stratified Cox's model, both
have power properties between the Weibull and the non-proportional Cox,
and this is true for all values of the β1 and β2 simulations.

The difference between the power of Cox's non-proportional
hazard model and the Weibull proportional hazard model is about 5% near
β1 = 0 and about 9% near β1 = 1, when β2 is equal to 0.2 for both. The
difference increases with increasing values of β2, so that at
β2 = [...], with values near β1 = [...], the difference is [...].

There is clearly a lack of consistency in the power efficiency of
some of the models, in so far as values of β1 are concerned. The
superior models, namely Cox's proportional hazards and the Weibull, in
fact consistently produce the same power value for a given β2 regardless
of the value of β1. The difference in power for any given value of β2,
for either Cox's proportional hazard or Weibull, does not
vary by more than 3% over the range of β1 values. For the less
appropriate model, Cox's non-proportional hazard, we note a
declining trend at n = 25 with increasing β1 [...]. For
example at [...] and β1 = 0 the power is 69% and the value declines to
63% when β1 = 1.



The stratified Cox's model also produced consistently similar
values of power for the test of β2 regardless of β1, although there
is a slight loss of power compared to Cox's unstratified model.
The difference between their power values is almost consistently 4%.

Next we consider the sample size of 25, significance level 0.05,
the increasing non-proportionality at P* = 0.3 and the decreasing
non-proportionality at P* = -.3. Generally the power values for P* > 0
are slightly superior to the corresponding values for P* < 0. This is
mainly due to the convergence or the divergence of the hazard rates for
the given range of β1 and β2 values.

First we deal with P* = +.3, figure (5.6.27). At β2 = 0 the
power of all tests and all models is once again very close to the
significance level of 0.05. In fact there is very little to
separate the power of the tests according to the type of model or the
range of β1 values. For values of β2 > 0 there is once again a slight
difference between the power of the tests according to the type of
model. Consistently the exponential is the worst model in the analysis
of non-proportionality. The best model for such samples is Cox's
non-proportional hazard model. The stratified Cox's model also produces
relatively superior power values compared to the Weibull or Cox's
proportional hazard models. The non-proportional hazard model produces
very consistent power values for β2 regardless of β1 values. This is
in fact also true of the stratified Cox models. The two less
powerful tests in the analysis of non-proportional samples are the
proportional hazard Cox and the proportional hazard Weibull. At
β2 = .2 we note that the value of β1 does not affect the power
of any of the tests, and the maximum difference over the range of β1
values is 3%.

There is a lack of consistency in the power of the β1 tests as the
magnitude of β2 increases. This pattern is not present for the
correctly specified models, namely Cox's non-proportional hazard and
the stratified Cox's model. With the proportional hazard Cox and
Weibull models however we note a decline in the efficiency of the tests.
The decline in efficiency for β2 = .1 over the range of β1 from 0
to 1 is about 9% if a proportional hazard model is used. Before we
leave this point however, we must remark that this pattern is present
at this magnitude only at the relatively low sample size of 25. The
value of P* = -.3, figure (5.6.28), produces non-proportionality which
generally implies a higher loss of power compared to P* = .3. At
β2 = 0 the power is about 0.05 for all tests and all values
of β1. However there is a slightly higher variability over the range
of β1 values for the different models compared to the situation of
P* = +.3.

With increasing values of β2 there is a general increase in the
overall power of the tests, which indicates that for all models the
value of β2 is the major factor influencing power. The pattern is
similar to that for P* > 0, indicating that the non-proportional hazard
Cox's model and the stratified Cox's method are the superior models.
This once again indicates that correct specification of the model
implies a generally constant power for the β2 values regardless of β1
values. The worst situation occurs for the Weibull and Cox's
proportional hazard models at P* = -.3. There is once again a slight
indication that increasing values of β1 may influence the power, which
is to some extent due to the low sample size and partly due to lack of
balance due to non-proportionality.

With an increase of the sample size to 50, figures (5.6.29) to
(5.6.31), and to 100, figures (5.6.32) to (5.6.34), a similar pattern to
before is repeated. However the differences between the appropriately
fitted models and the unsuitable models, in both the proportional and
non-proportional hazard situations, decline. Under proportionality
of hazards, the Weibull is in fact the most suitable model and produces
the highest power of the tests. (It must be noted however that the
generations are also of proportional hazard Weibull type.) Cox's
proportional hazard model is also suitable in that it is not influenced
by varying values of β1. The two relatively unsuitable models are the
stratified Cox and the time dependent Cox, although their loss of
efficiency is relatively small. In the analysis of generations of
non-proportional hazard type both the Weibull and the proportional
hazard Cox decline in power.

As we pointed out earlier, clearly the exponential model is the 
least suitable model and we have included the model purely for reference 
in the graphs. 

In conclusion, the stratified Cox's model and the time dependent
Cox's model are both suitable for the analysis of non-proportional
generations. The value of the covariate effect in this respect does not
vary the power of the test. As may be expected, the power of the test
is purely dependent on the magnitude of the treatment effect. In the
situation of proportional hazards both the Weibull and Cox's
proportional hazard models have good power properties. Once again the
power is dependent on the magnitude of the treatment effect and is not
influenced by the covariate effects. This is in opposition to the
findings of a similar study in situations of proportional hazards, where
an unreasonable loss of power is detected due to the magnitude of the
covariate effect (C.L. Chastong 1983). In our study the specification of
wrong models does imply a loss of power which, with the small sample
size of 25, can become unreasonably dependent on the magnitude of the
covariate effect.

For all practical purposes the semi non-parametric methods
provide a robust framework for the analysis of the data.

CHAPTER 6

ANALYSIS OF THE OLD EDINBURGH DATA 

In this chapter we proceed with the analysis of data from 
a clinical trial. The purpose of this chapter is to illustrate some 
of the results of the previous sections, using the proportional hazards 
model. Since the analysis of the breast cancer trial data is the main 
part of the discussion, we will begin this chapter with a history of the 
treatment of the disease. In the later sections a general overview of 
the subject will be presented. Then our data are described and the
procedures which were adopted to collect them are presented. Finally
we analyse the data using the general methods with a single event of
interest and multiple coefficient models with tests of interactions.

In the present chapter we only consider time independent
covariates; in chapter 7 we will deal with time dependency
and multivariate risks.

6.1 Randomised trials in early breast cancer.

Breast cancer is the most common form of cancer among women
in the Western Hemisphere. Despite this, there is no general agreement
as to the best treatment of an early case. This disagreement is
related both to the type of surgery and to the value of radiotherapy.
Recently various forms of post-operative drug treatment have also
added a new dimension to the decision making.

Breast cancer is one of the few malignant diseases for which
well documented data exist on long term survival in untreated
patients. The earliest efforts at treatment of the disease took place
some 80 years ago. Later, during the 1950's, epidemiologists gathered
the first impressive arguments against the use of the established
treatment of the time, which was radical surgery. It seemed that the
treatment did not cure the patients in terms of their long run survival
or the proportion developing metastatic disease.

Following the above developments many studies were carried out
to assess a range of different treatments, which consisted mainly of
loco-regional treatment by various forms of surgery and radiotherapy.
Subsequently ovarian ablation by oophorectomy or by irradiation has
also been used. None of these treatments, however, produced a major
improvement in terms of overall survival. Much of this lack of success
has been ascribed to the fact that patients liable to develop
metastatic disease are not influenced by loco-regional treatments, and
the development of the metastatic disease has not been attacked by the
treatment.

At present there are new trials taking place in which surgery
is followed by chemotherapy. The value of such drug treatments
is assessed by considering the benefits not only in terms of local
progression but also in terms of systemic progression of the
disease. Some of the methods we study in the next chapter are in
fact appropriate for the proper assessment of the effects in this
form of trial.

Often the evaluation of the treatment of breast cancer is
made difficult by the fact that patients differ considerably in the
individual development of the disease. Various prognostic
factors have in the past been assessed in terms of the effects of
various treatments. Some of the indicators that have been given
importance in the past are the size of the initial tumour, axillary node
involvement and menstrual status. In chapter 8 we will deal in more
detail with the important prognostic effects. Generally the size of the
tumour is invariably related to survival, a result that has been shown
consistently. Another strong prognostic indicator is the extent of
axillary node involvement. This can be measured as an index with
involved or not involved categories, or by an index representing the
extent of the involvement through the number of nodes examined and the
number found to be involved. Age and menstrual status are two factors
that are closely related to each other and are of less strength than
node involvement or tumour size in assessing a patient's chance of
surviving the disease.

It has often been shown that with increasing age the chance
of survival increases until the menopause. After the menopause the
survival rates decline at a slower rate. The effects of the other
prognostic factors are also present if we carry out a separate
stratified analysis of the various age categories.

6.2 Description of the data. 

The objective of this trial has been to assess the pattern 
of survival rates for a group of patients with the invasive carcinoma 
of the breast, who were treated in a clinical trial in the South East 
of Scotland from 1964 to 1971. 

In the protocol, the general criterion for selection was 
taken to be all female patients between the ages of 35 and 69 inclusive. 
Further it was considered essential that all patients must be suitable 
for treatment by either arm of the trial, so that a reasonable level 
of homogeneity of patients is established in terms of prior treatment 
status. 

The two trial options were: 

(1) Radical mastectomy: The breast, the pectoral muscles and the 
axillary contents were removed. 

(2) Simple mastectomy plus post-operative radiotherapy: The breast 
was removed from the fascia overlying pectoralis major via an 
elliptical oblique incision. This included the nipple and the areola. 
Post-operative radiotherapy was given by a 2 MeV Van de Graaff generator. 
The axilla and the supraclavicular fossa were irradiated using parallel 
semi-opposed fields to 4250 rad maximum dose in 10 fractions in 4 
weeks. The chest wall and the internal mammary nodes were irradiated 
by parallel tangential fields to 4500 rad in fractions in 4 weeks. 



All patients were categorised into stage 1, 2 and 3 patients 
according to the then current International Staging System based on 
TNM (codes for Tumour size, Node involvement and Metastasis status 
respectively). In chapter 8 we consider the development of the 
TNM staging in greater detail. 

Thus the Stage I patients were composed of patients with 
tumours of size 5 cm or less in the maximum diameter, skin fixation 
absent or incomplete, nipple may be retracted or Paget's disease present, 
pectoral muscle fixation absent, chest wall fixation absent, no 
homolateral axillary nodes palpable and no distant metastases present. 

Stage II patients had primary tumours as in Stage I but 
also had homolateral axillary nodes palpable, movable and not 
fixed to one another. 

Further, certain members of Stage III were also defined as 
eligible to take part in the study. Such cases may have tumour of 
any size, skin fixation complete or ulceration not exceeding 3 cm 
in diameter, peau d'orange present in tumour area only, pectoral 
muscle fixation complete or incomplete. Stage III patients who 
were excluded were cases with skin involvement wide of tumour or 
ulceration greater than 3 cm, peau d'orange wide of tumour, chest 
wall fixation present, homolateral axillary nodes fixed to each other 
or to adjacent structures, oedema of the arm, or homolateral 
supraclavicular or infraclavicular nodes movable or fixed. All Stage 
IV patients were excluded. These are in fact patients with distant 
metastasis. 

Apart from the above, patients with any of the following were 
also excluded from the trial: 

(a) Previous treatment for carcinoma of the breast. 

(b) Bilateral breast carcinoma. 

(c) Any other malignancy. 

(d) Breast carcinoma having arisen during or presenting in association 
with pregnancy or lactation. 

(e) Previous bilateral ovariectomy or pelvic irradiation. 

(f) Peripheral vascular disease of the upper limb. 

(g) Certain tumours in the axillary tail unsuitable for treatment by 
radiotherapy because of position. 

Patients with advanced disease are usually subjected to high 
risks under operation. On ethical grounds this entails the exclusion 
of all such patients from the arms of the trial. For scientific 
reasons a few conditions in this respect are of importance. Advanced 
patients have often short survival times due to external factors and thus 
their distribution can mask the treatment survival patterns. The 
number of such patients is often small and an unbalanced distribution of 
these patients can make the treatment comparisons controversial. There- 
fore in order to obtain more uniform groups of patients for the final 
comparisons it is reasonable to exclude the advanced patients. 



All eligible patients were further stratified according to 
age and stage of the disease. Such criteria were taken to be the 
minimum data necessary for a random allocation of patients into the 
trial. Clinical stages form 3 strata: (a) stage I, (b) stage II and 
(c) stage III. Age is also categorised into three strata: (a) 35-44, 
(b) 45-59 and (c) 60-69. 
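
The nine strata defined above can be sketched in a few lines of Python. This is only an illustration of the age and stage bands quoted from the protocol; the function name `stratum` and the returned labels are assumptions made for the example and do not come from the trial documentation.

```python
def stratum(age, stage):
    """Return the (age band, clinical stage) stratum, or None if the
    patient falls outside the protocol range (ages 35-69, stages 1-3).

    Hypothetical helper: the bands are those quoted in the text,
    the labels are illustrative only.
    """
    if stage not in (1, 2, 3):
        return None          # stage IV and unstaged cases are not eligible
    if 35 <= age <= 44:
        band = "35-44"
    elif 45 <= age <= 59:
        band = "45-59"
    elif 60 <= age <= 69:
        band = "60-69"
    else:
        return None          # outside the 35-69 age criterion
    return (band, stage)


print(stratum(52, 2))   # a 52 year old stage II patient
print(stratum(71, 1))   # too old for the protocol
```

Allocation would then be carried out separately within each stratum, which is what makes an accurate assessment of age and stage at entry important.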

A randomisation office was set up and on receipt of name, 
age and the stage of the disease at the initial examination from a 
peripheral hospital, the units concerned were informed of the treatment 
by telephone and by writing. 

The benefits of a stratified allocation of treatments can 
be maximised by an accurate assessment of the categories. Age seems 
not to be a major problem since it is a single measurement on a 
continuous scale. However stage is a collection of various items of 
information based on T, size, and N, node staging. 

The M staging, presence of metastatic disease, in this trial 
identifies only a group of ineligible patients, and it is important that 
the assessment of presence or absence of metastatic disease is very 
accurate. In order to reduce the chances of including cases with 
skeletal metastasis, it was stated in the protocol that X-rays of 
chest and pelvis should be taken in all the cases included in the 
trial. It stated further that if possible this should be done by 
the surgical unit and films must be sent with the patients. 



The total group of patients who were randomised by this 
procedure included approximately 50% of patients found at the time of 
operation to have benign breast disease. These patients were excluded 
from the trial for all purposes. As we will point out, this 
procedure resulted in the allocation of unequal numbers of patients for 
each comparable stratum of the treatment arms. However the final 
imbalance in terms of the number of malignant patients is not of 
practical importance. 

6.3 Recording of Information. 

A general procedure was adopted so that information on the 
patients could be standardised and so processed by a computer. However 
it was noted in the last review of the data, performed in 1981, that 
certain concepts and categories defined by the protocol were not in 
accord with more recent practice. Most of these changes did not 
create a major problem of interpretation. The major source of 
inconsistency among the changing definitions seemed to be concepts 
related to the recurrence of the disease. In fact in the original 
protocol there was no mention of the definition of recurrence of the 
disease, although in the data forms space was allocated to recording 
of such information. 

Basically there were 4 standard forms available for processing: 
Form 1 - The Initial Examination form, Figure (6.3.1) 
Form 2 - The Primary Treatment form, Figure (6.3.2) 
Form 3 - The Anniversary Record form, Figure (6.3.3) 
Form 4 - The Pathology Report form, Figure (6.3.4) 



[Figure (6.3.1): Form 1, Initial Examination. Fields: serial number, 
surname, given names, address, county, unit, surgeon, marital state 
(M or S), date of birth and age, menstrual state (premenopausal, 
menopausal, post-menopausal), age at menopause, date first symptom or 
sign noticed, side (R or L) and site of primary tumour (medial half 
only, lateral half only, central, both halves, whole breast, unknown or 
other), size (greatest diameter in cm), TNM categories (T and N), 
clinical stage, skin involvement and pectoral muscle involvement for 
Stage III tumours, and selected treatment option (R1, R2, S1 or S2).] 

[Figure (6.3.2): Form 2, Primary Treatment. Fields: serial number, 
surname, given names, address, date of first treatment, surgery items 
coded NO or YES (simple mastectomy, node or nodes removed, part of 
pectoral fascia removed, part of pectoral muscle removed, radical 
mastectomy, closure without skin graft, closure with skin graft), days 
in hospital, radiotherapy (minimum dose and maximum dose in rads, time 
in weeks), and supplementary treatment (none, ovariectomy, ovarian 
radiation; ovarian radiation completed or incomplete).] 

[Figure (6.3.3): Form 3, Anniversary Record. Fields: serial number, 
surname, given names, address, anniversary year, late complications 
(oedema of arm, limitation of shoulder movement, other), date of first 
evidence of local recurrence and site (chest wall, axilla, 
supraclavicular fossa, internal mammary node), date of first evidence 
of distant metastasis and site (skeleton, lung, pleural effusion, 
other), secondary treatment commenced in the anniversary year (surgical 
excision of metastases, radiotherapy, hormone therapy, endocrine 
surgery, cancer chemotherapy, other), death during the anniversary 
year, date of death and cause of death (carcinoma of breast; other 
cause with recurrence present; other cause with no evidence of 
recurrence; and, for other causes, complication of primary treatment, 
complication of secondary treatment, other primary neoplasm or other 
intercurrent condition).] 

[Figure (6.3.4): Form 4, Pathology. Fields: serial number, surname, 
given names, address, laboratory, primary tumour (sarcoma, non 
malignant), size (greatest diameter in cm), descriptive type 
(scirrhous, comedo, papillary, medullary, mucoid, squamous, Paget's, 
other, more than one), differentiation (well differentiated, moderately 
differentiated, poorly differentiated, anaplastic), cell type 
(pleomorphic, large cell, small cell, spheroidal, duct cell), and 
intraduct tumour (none noted, present, alone). N.S. and N.A. are coded 
X and Y respectively.] 

Form 1 provides the necessary information used by the 
surgeons to make the initial decisions regarding eligibility and the 
stratification of the patients. All the information regarding staging 
is placed on this form. The site of the disease is also recorded. 
At the end of this form the treatment that is allocated is recorded. 
The treatment categories R1, R2, S1, S2 represent a random division of both 
the radical mastectomy group and the simple mastectomy plus radiotherapy 
group into subgroups. At the time of trial design, many of the 
currently available statistical techniques were unknown, and the 
initial intention for the design of these subgroups was to allow some 
crude estimation of the effect of randomisation. These subgroups are 
ignored in this thesis. 

Form 2, Primary Treatment form: this form records the basic 
data necessary to categorise patients on the treatments administered and 
also allows the possibility of checking any deviations from the treatments 
allocated under the protocol. It is important to stress that although 
this trial was initiated at an early time with respect to randomised 
trials, the concept of standardised treatment is clear, and the 
information that was collected for assessing the diversity in terms of 
surgery and radiotherapy indicates good conformity with the protocol. 

Form 3, Anniversary Record: all follow-up information was 
envisaged to be recorded on this form. Initially the major concern on 
the follow-up information in the protocol was that of devising a 
procedure by which all patients may be seen at a specific follow-up 
clinic. However it was requested that on radiotherapy case records 
information on oedema of the arm, limitation of shoulder movement, 
chest pain and dyspnoea due to post-irradiation pulmonary fibrosis, 
and post-irradiation skin atrophy should be recorded, for the purpose 
of a retrospective assessment of their occurrence. On the actual 
diagnosis of recurrence of disease it was requested that information 
should be provided for site, date of appearance of metastasis and the 
subsequent treatment. Finally the cause and time of death are also 
recorded on the anniversary form. 

Form 4, pathology report form; this form keeps the 
information on size of tumour and the number of nodes found to be 
involved. 

Initially 1099 patients were randomised to one 
of the two trial options. Of this number 512 were found to have 
benign disease and so were withdrawn, and thereafter no data were 
collected on them for the purpose of the trial. The remaining 587, 
who had histological proof of carcinoma, consisted of 273 patients 
treated by simple mastectomy and X-ray therapy, 288 patients 
treated by radical surgery and 26 ineligibles. 

In an initial analysis of the data, 87 cases who had breast 
cancer were withdrawn for reasons of violations of the protocol. Such 
violations included: case not belonging to the proposed protocol population, 
case having an ineligible form of malignancy, and protocol violation due 
to inappropriate treatment. Decisions in regard to trial violation 
were made by a trial committee and it was decided to exclude all such 
cases from the final analysis. However in a review analysis of this 
data some of the follow-up concepts were altered and the number of 
eligible patients for analysis was increased to 273 for simple mastectomy 
and radiotherapy and 288 for radical mastectomy. The main reason for 
this increase in the numbers of eligible patients for analysis was 
the introduction of a policy of comparison of patients according to 
the treatment allocated rather than the treatment performed. Table 
(6.3.1) gives details of the relevant reasons for the exclusion or 
inclusion of the original deviants and other cases. 

Another area in which the review of the follow-up data 
implied slight changes in the concepts adopted 
was the assessment of response to the treatment 
following the recurrence of the disease, and general concepts such as 
local and metastatic disease. For this particular trial it is 
important to consider response to the treatment in terms of the delay in 
the development of the local disease. Similarly it is of interest to 
consider disease free survival and the time to metastatic recurrence. 
In terms of times after the recurrence of the disease it is generally 
expected that the treatment will not affect the survival of the patients 
a great deal after the detection of metastatic recurrence. 

For the general recurrence categories the classification of 
contralateral disease had been reviewed. In the past all 
secondary tumours were considered to be a metastatic recurrence; however 
with the new review some had been classed as new malignancies. 
Therefore it seems important that for future collection of any trial data, 
allowance must be made for the possibility of changing definitions. 
It seems that in general all categorisation relating to time, such as 
response or the duration of an interval, will eventually be indexed 
in more detail in terms of length of duration, extent and form. It 
may well be the case that metastatic disease will eventually be looked 
at in terms of extent and duration. This is likely to become more 
usual as multiple failure analysis becomes more common. In the past 
success or failure of surgical treatment has been assessed mainly by 
survival. With the new drug treatments, assessment of response to 
treatment and the detection of recurrence are playing an increasingly 
important role. 

In the last section the data was described. In so far as 
prognostic information is concerned, the data is held on the initial 
examination form. Later in this chapter cross tabulations of different 
prognostic factors will be presented. It must be noted that for some of 
the factors measured on a continuous scale it may be desirable to 
categorise the variable. Age of the patient is one such example: it is 
possible to split the population into different age groups and then study 
the survival distribution for each category. 

For the events after treatment that may contribute to the 
understanding of the disease treatment process, there are 4 major events 
that we consider as important. These are local recurrence, metastatic 
recurrence, death and the last follow-up date. Clearly these events 
can produce in combination a large number of measurable periods. Using 
these periods it may be useful to study time to a particular stage 
of the development of the disease, or it may be possible to stratify 
subgroups of patients according to some prior event. For example, one 
can stratify the population according to time to local disease and 
observe the distribution of time from local disease to death. 
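
As a rough sketch of how the measurable periods arise from the four events, the following Python function derives the intervals for one patient. The function name, the argument names and the use of the entry date as the time origin are assumptions made for illustration; the thesis does not specify how the dates were stored or processed.

```python
from datetime import date


def periods(entry, local=None, metastatic=None, death=None, last_follow_up=None):
    """Return the periods (in days) between the recorded events.

    Hypothetical helper: a period is None when one of its two events
    was not observed for this patient.
    """
    def gap(start, end):
        return (end - start).days if start and end else None

    # Follow-up ends at death if observed, otherwise at the last follow-up date.
    end = death or last_follow_up
    return {
        "entry_to_local": gap(entry, local),
        "entry_to_metastatic": gap(entry, metastatic),
        "local_to_metastatic": gap(local, metastatic),
        "local_to_death": gap(local, death),
        "metastatic_to_death": gap(metastatic, death),
        "entry_to_end": gap(entry, end),
        "died": death is not None,   # False means a censored survival time
    }


# Illustrative dates only, not taken from the trial data.
example = periods(date(1965, 1, 1), local=date(1966, 1, 1), death=date(1968, 1, 1))
print(example)
```

A patient without local recurrence simply has no `local_to_death` period, which is why an analysis of time from local disease to death is restricted to the subgroup in which the prior event occurred.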
                                                  RMX   SMX+XRT   Total 

Treatments according to option drawn. 
  Treatment according to protocol                 257     243      500 
  Correct treatment, minor option modification*     6      14       20 

Randomised therapy deviations.* 
  Immediate XRT given though not indicated          4       1        5 
  Surgery only - died before XRT                    -       4        4 
  Incorrect oophorectomy                            7       9       16 
  Wrong surgery                                    14       2       16 

Entered for this analysis.                        288     273      561 

Legitimate withdrawals after randomisation 
  Benign disease                                  277     235      512 
  Ineligible but malignant                         13      13       26 

Total patients randomised                         578     521     1099 

* A detailed survival study according to malignant withdrawals and 
exclusions would deviate from the general course of study; in 
terms of conclusions they do not affect the overall results. 

Table (6.3.1) 
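
The counts in Table (6.3.1) can be checked for internal consistency with a short Python sketch. The row labels are abbreviated, and the empty RMX cell for "Surgery only - died before XRT" is assumed to be zero; both are choices made for this illustration.

```python
# Counts per row as (RMX, SMX+XRT), taken from Table (6.3.1).
entered_rows = {
    "according to protocol": (257, 243),
    "minor option modification": (6, 14),
    "immediate XRT not indicated": (4, 1),
    "surgery only, died before XRT": (0, 4),   # blank RMX cell assumed to be 0
    "incorrect oophorectomy": (7, 9),
    "wrong surgery": (14, 2),
}
withdrawn_rows = {
    "benign disease": (277, 235),
    "ineligible but malignant": (13, 13),
}

# Column sums: patients entered for the analysis in each arm.
entered = tuple(sum(col) for col in zip(*entered_rows.values()))
withdrawn = tuple(sum(col) for col in zip(*withdrawn_rows.values()))
randomised = tuple(e + w for e, w in zip(entered, withdrawn))

print("entered:", entered, "total", sum(entered))
print("randomised:", randomised, "total", sum(randomised))
```

The column sums reproduce the 288 and 273 patients entered for analysis and the 1099 patients randomised, which is consistent with comparison by treatment allocated rather than treatment performed.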



A general discussion on the methods of construction of the 
likelihood functions for a stratified analysis of the data is given 
in chapter 7. Figure (6.3.5) represents a possible path for a 
progression of disease of the kind described above. 

Figure (6.3.5) 



In the first instance for the analysis of the data we consider cross 
tabulations of various categories. Then we study the failure 
distribution of the population in terms of survival times. For the 
estimation of the important prognostic factors we use Cox's proportional 
hazard model later in this chapter. In chapter 7 we consider time 
dependency of various prognostic indicators with different functional 
forms of time dependency. Using the same data later in chapter 7 we 
consider the effect of multiple events present in the time scale with 
the use of semi-Markov hazard models. 

6.4 Initial analysis with cross tabulation tables. 



A good preliminary study of the data can be performed by 
a set of cross tabulations. The value of the Pearson chi-square can 
indicate a possible association between the distributions of the two 
factors. At this stage we are only trying to assess whether the 
data is distributed according to the expectations of previous studies. 
Appendix A presents important cross tabulations for the prognostic 
factors. The data is described by the following factors. 

1. Menopausal status: premenopausal, menopausal, post menopausal. 

2. Side of the lesion: right, left. 

3. Site of the lesion: medial half only, lateral half only, central, 
both halves, whole breast. 

4. Size stage: T1, T2, T3. 

5. Node stage: N0, N1. 

6. Stage: S1, S2, S3 (stratifying factor). 

7. Skin involvement: not relevant, T1, T2, T3. 

8. Pectoral muscle involvement: not relevant, T1, T3. 

9. Treatment option: radical mastectomy, simple mastectomy and 
radiotherapy. 

10. Disease status: local and metastatic recurrence, metastatic recurrence, 
local recurrence, none. 

11. State: alive, dead. 

Most of the above factors (1 to 6) are related to prognostic state of 
the patient. Skin and pectoral muscle involvement refer to extent and 
site of early developments of the disease. Disease status and state 
finally refer to indicators of the progression of the disease at time 
of last follow-up. The option which defines the treatment allocated is 
also looked at for assessing distribution of the prognostic factors. 

The first tables we will consider are the set (A.1) within 
Appendix A. As may be expected the largest number of patients are post 
menopausal (359). There are 38 menopausal and 163 pre-menopausal 
patients. When we consider the distribution of the 3 categories of 
menopausal state against other prognostic factors there are no 
statistically significant associations (except for age). The most 
significant value is for node status, with χ² = 3.0, 2 d.f., giving the 
probability value of 0.22, which is not significant but indicates more 
node palpability with pre-menopausal patients. χ² = 4.9, 4 d.f. and 
p = 0.29 is obtained for T stage, indicating smaller tumours for 
pre-menopausal patients and larger tumours for post menopausal patients. 

There are 284 right side main lesions and 276 left side lesions. 
Side of the lesion is not an important factor in defining a patient even 
when we consider the site of the main lesion categories. The most 
significant association with side is for T stage with χ² = 3.3, 2 d.f. and 
p = 0.18, which is not significant. Site of the lesion has been 
categorised in a way that basically indicates the size of the tumour. 
There are 286 patients with their lateral half involved, 183 with medial 
half involved and 67, 22 and 2 with central, both or whole breast 
involved by the tumour respectively. 

The T stage in fact gives χ² = 17.7, 8 d.f. and p = 0.02 
and implies T3 (larger tumours) with central and both halves involved. 
Smaller tumours correspond with the medial half or lateral half alone 
involved. 



Node status gives χ² = 7.6, 4 d.f., p = 0.10, with more 
node positive patients among those with central or both halves involved 
(or perhaps basically among those with larger tumours). 
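
The probability values quoted with these chi-square statistics can be reproduced approximately. The sketch below uses the closed-form upper tail of the chi-square distribution that holds when the degrees of freedom are even, which covers the 2, 4 and 8 d.f. cases above; the function name `chi2_sf_even` is an illustrative choice, and the small differences from the quoted values presumably reflect rounding of the reported statistics.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability P(X >= x) for chi-square with even df.

    For df = 2m the tail is exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# Statistics quoted in the text for menopausal status, side and site.
for x, df in [(3.0, 2), (4.9, 4), (17.7, 8), (7.6, 4)]:
    print(f"chi-square = {x}, {df} d.f. -> p = {chi2_sf_even(x, df):.3f}")
```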



Stage of the disease was defined to be a combination of 
the T stage and the Node status and the following tables clearly 
indicate this:- 
                     T stage 
Node state       T1     T2     T3    Total 
  N0             35    273     67     375 
  N1             21    124     40     185 
  Total          56    397    107     560 

                     S stage 
T stage          S1     S2     S3    Total 
  T1             35     17      4      56 
  T2            272    124      1     397 
  T3              0      0    107     107 
  Total         307    141    112     560 

                     S stage 
Node state       S1     S2     S3    Total 
  N0            307      0     68     375 
  N1              0    141     44     185 
  Total         307    141    112     560 



T stage with node state cross tabulation gives χ² = 2.0 with 2 d.f. 
and p = 0.37. 
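
The Pearson chi-square for the T stage by node state table can be recomputed from the cell counts. The sketch below is a plain implementation of the usual observed against expected calculation; the function name `pearson_chi2` is illustrative, and the p-value line uses the fact that for 2 d.f. the upper tail is simply exp(-χ²/2).

```python
import math


def pearson_chi2(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Rows: N0, N1. Columns: T1, T2, T3 (counts from the cross tabulation).
node_by_t = [[35, 273, 67],
             [21, 124, 40]]

stat, df = pearson_chi2(node_by_t)
p = math.exp(-stat / 2.0)        # exact upper tail for 2 d.f. only
print(f"chi-square = {stat:.2f}, {df} d.f., p = {p:.2f}")
```

The result agrees with the values quoted for this table, which also confirms that the T2 column total is 397.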



As presented in tables (A.1), stage is significantly associated with 
site. Stage 1 patients (good prognosis) more commonly have the medial 
or lateral half alone involved, which reflects the previously 
mentioned association with size. 



Now we will consider the distribution of the prognostic 
indicators within the two arms of the trial. It must be reiterated 
that treatments were allocated before malignancy was diagnosed and 
therefore some of the patients were later removed for trial purposes 
since they had benign tumours. The total number of radical surgery 
patients is 288 and the number of simple mastectomy and radiotherapy 
patients is 272. The treatment options were also stratified according to age 
and the stage of the disease, but again the benign disease exclusions 
could affect this balance. Table (6.5.1) in fact shows that in most 
respects a good balance between the treatment groups resulted from the 
randomisation, with only slightly more T1 patients being allocated to 
the radical mastectomy group. 

The next set of categories which were studied by the cross 
tabulations were the disease progress indicators and the spread of the 
initial tumour. Here we must emphasise that disease indicators 
such as progression of the disease and final state of the patients will 
be studied more extensively in the next chapter. The present method 
of considering the cross tabulations does not allow an independent 
survival and censoring analysis, and the χ² values reported should not 
be interpreted as representing the value of a treatment at this stage. 
These tables are presented within section (A.2) of Appendix A. 

Menopausal status shows a degree of association with the 
indicators of the spread of the initial tumour, in terms of skin and pectoral 
muscle involvement. The pre-menopausal patients have a lower level 
of skin and pectoral muscle involvement, χ² = 17.5, 8 d.f., p = 0.02 
and χ² = 11.1, 6 d.f., p = 0.08 respectively. The status of the 
patients at the end of the study gives χ² = 25.8 with 2 d.f. and 
p < 0.0001, indicating much better survival for the younger patients. 
Surprisingly the disease recurrence does not reach significance with 
the present method, giving p = 0.11, with less local or metastatic 
disease among younger patients. 



Side of the lesion does not play an important role for the 
final or initial disease progress. The highest value for the side is 
by skin involvement, χ² = 3.6, 4 d.f., p = 0.46. Categorisation by site 
however plays an important role for the pectoral and skin involvement: 
χ² = 27.8, 16 d.f., p = 0.03 and χ² = 24.4, 12 d.f., p = 0.02 
respectively for skin and pectoral muscle involvement. As we mentioned 
previously site is a reflection of the size of the tumour and in 
general medial or lateral halves involved produce less skin and pectoral 
involvement than other sites. The same pattern appears with the disease 
progression. More local or metastatic recurrence is noticed with both 
halves or central area tumours, χ² = 21.0, 12 d.f., p = 0.06. 
Following the above, lateral and medial half only produce the best number 
of survivors, χ² = 8.9, 4 d.f., p = 0.06. 



T1 patients described earlier are a better prognostic 
group. By definition they have less initial skin and pectoral muscle 
involvement and finally less local or metastatic recurrence and 
therefore are better survivors. 

T stage of the tumour 
                              χ²      d.f.      p 
Final disease condition     19.74      6     0.0031 
Final survival status       14.14      2     0.0009 

Node status does not produce any association with skin and 
pectoral muscle involvement. However with the node negative patients 
there is less recurrence of the disease at the end of the study, χ² = 22.7, 
3 d.f., p < 0.0001. Further, node negative patients are better survivors 
than the node positive patients. 

S1 cases are taken to be a good prognosis group and Stage 2 
and 3 respectively worse. This is true both for the number of survivors and 
for the number of recurrences. Stage 1 groups give the highest 
proportion of disease free survivors, χ² = 27.6, 6 d.f., p = 0.0001, 
and a better number of final survivors, χ² = 7.05, 2 d.f., p = 0.03. 
For the pectoral and skin involvement there is a defined relation 
between S stage and involvement. 

In the above discussion prognostic factors that indicate a 
significant skin involvement also indicate a pectoral muscle involvement. 
The two are very closely related and often coincide. However skin and 
pectoral involvement do not have a significant association with disease 
recurrence, χ² = 13.8, 12 d.f., p = 0.31 and χ² = 12.1, 9 d.f., p = 0.21 
respectively. Nor do skin and pectoral involvement show a significant 
association with the status at the end of the study, χ² = 4.29, 4 d.f., 
p = 0.36 and χ² = 5.51, 3 d.f., p = 0.13 respectively. 

Disease recurrence and final status of the patients are very 
closely related, as expected, with metastatic recurrence producing a 
larger proportion of dead cases. Treatment option and disease progress 
will be studied in later sections. In so far as the numerical 
distributions are concerned, treatment option is not associated with the 
skin or pectoral involvement or disease progress. Disease recurrence and 
option give χ² = 1.83 with 3 d.f. and p = 0.60. However final status of 
the patients seems to be related to option, χ² = 8.2, 1 d.f., p = 0.004. 

So far the description of the data has been concerned with 
sets of categorical variables. The picture emerging is that T stage, 
S stage, menopausal status, node involvement and treatment option 
are the factors producing the major associations with the categories of 
final disease status and survival status. Menopausal status, T stage and 
S stage are related in effect to two important continuous variables, 
namely age for menopausal status and size for T stage and therefore 
S stage. It seems proper to look at the distribution of these 
variables. Table (6.4.1) gives the means and the standard deviations 
of age and size for all the 561 cases. Menopausal status is one factor 
that is of course related to age, and the distributions in 
the table clearly indicate this. The size of the tumour is similarly 
related to T stage, and this is clearly shown by the table. 



                                      AGE              SIZE 
                                  Mean    s.d.     Mean    s.d.      n 

Pre menopausal                   44.18    4.28     3.45    1.48    163 
Menopausal                       48.62    3.65     3.87    1.49     38 
Post menopausal                  59.91    5.85     3.62    1.40    359 

Right side                       54.27    9.19     3.71    1.44    284 
Left side                        54.87    8.73     3.49    1.41    276 

Site medial                      55.05    8.62     3.46    1.25    183 
Site lateral                     54.08    9.23     3.51    1.39    286 
Site central                     55.30    8.28     3.11    1.57     67 
Site both                        55.41   10.26     4.?8    1.86     22 

T1                               51.44    9.17     1.64    0.70     56 
T2                               54.77    8.93     3.51    1.01    397 
T3                               55.45    8.73     4.95    1.68    107 

N0                               55.42    8.93     3.52    1.38    375 
N1                               52.34    8.78     3.75    1.54    186 

S1                               55.22    9.03     3.26    1.14    307 
S2                               52.51    8.76     3.39    1.080   141 
S3                               55.38    8.70     4.79    1.85    112 

Skin involvement     T0          51.72    9.88     3.33    0.58      2 
                     T1          50.47    8.30     4.83    2.48     12 
                     T2          56.12    8.78     4.91    1.89     70 
                     T3          56.21    8.50     4.028   1.50     36 
                     Not involved 54.31   9.00     3.32    1.15    440 

Pectoral muscle      T0          57.73    7.70     3.75    1.98     14 
involvement          T1          54.94    8.01     5.21    1.84     63 
                     T3          55.97    8.91     4.05    1.51     43 
                     Not involved 54.32   9.03     3.32    1.17    440 

Radical MX                       54.18    0.12     3.59    1.49    288 
Simple MX + XRT                  54.97    8.77     3.60    1.38    273 

L + M Recurrence                 54.47    8.74     3.98    1.44    114 
M Recurrence                     55.40    8.21     3.89    1.45    141 
L Recurrence                     55.62    7.62     3.81    1.28     16 
No Recurrence                    54.14    9.43     3.29    1.37    290 

Alive                            52.59    8.88     3.31    1.39    265 
Dead                             56.33    8.66     3.85    1.43    296 

TABLE (6.4.1) 
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6.5 Survival time analysis of the old Edinburgh Trial data . 

At randomisation the patients were stratified according to
age and clinical stage of the disease. Table (6.5.1) indicates
a balanced distribution of patients to the treatment options within
each stratum. A comparison of the number of patients allocated to
each treatment by year of entry also gives an almost uniform
pattern of accrual. There is a slight deviation for some years.
The reason is that the treatments were allocated prior to histology,
so some patients who had been allocated a treatment proved to be
non-malignant and had to be excluded from the trial.

An unstratified comparison of the survival of radical mastectomy
patients and simple mastectomy patients gives a log rank χ² value
of 10.04 with 1 d.f., which is highly significant (p = 0.0015). Figure
(2.5.1) gives a plot of the Kaplan and Meier survival probabilities for
the two crude survival times.
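
The quoted significance level can be checked from the statistic alone. The following is a minimal sketch, assuming only that the log rank statistic is referred to a chi-squared distribution with 1 d.f.; the function name `chi2_sf_1df` is illustrative and not part of the original analysis.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with 1 d.f.

    With 1 d.f. the statistic is the square of a standard normal
    variable, so P(X > x) = erfc(sqrt(x / 2)).
    """
    if x < 0:
        raise ValueError("chi-squared statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


# Log rank statistic quoted in the text for the unstratified comparison.
p_logrank = chi2_sf_1df(10.04)
print(f"p = {p_logrank:.4f}")
```

The same function applied to a statistic of 8.2 gives a value close to the p = 0.004 quoted at the end of section 6.4.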

Further, a separate analysis of the survival times is performed
for each stratum. Certain of the subgroups indicate a highly
significant difference between the survival probabilities.
Table (2.5.2) gives a summary of the analysis of the various strata
using the logrank test. Generally speaking the treatment effect is
consistent for the various subgroups, with the most significant
differences being indicated by the subgroups with the larger numbers
of patients. One interesting pattern that emerges, however, from the
survival distributions is indicated in the survival plots of node
status, age, T stage and menopausal status with respect to survival
time, Figures (2.5.1) to (2.5.12). The crude hazard rates of the
treatments showed proportional rates of failure for the two groups,
Figure (2.5.2). This pattern is not so clear once we look at the
subgroups of T stage, age and menstrual status. The survival patterns
can be explored further by a plot of the hazard rates. Clearly the
plots indicate that, depending on the time of observation of each
subgroup, the rate of failure is slightly different. At this stage it
is not possible to explore this point further and assess the
significance of such a hypothesis, but only to observe it. In later
chapters more relevant questions may be asked with more advanced
statistical methods. These methods will be based on the validation of
the proportional hazards assumption. Meanwhile the methods of the
present chapter are based on the assumption of proportional hazards.
One important point to note is that we have so far only stated slight
differences in the significance levels of the different parametric
survival families and the various non-parametric tests, as fitted to
our data. We have not considered tests of the model assumptions in
order to attach a significance level to the model differences.

The analysis so far, presented in Table (2.5.2), indicates a
poorer survival for all patients treated with simple mastectomy and
radiotherapy. A categorisation according to stage indicates a
significant difference in the same direction for the stage 1 patients,
and no major difference between radical mastectomy and simple
mastectomy with radiotherapy for the stage 2 and stage 3 patients.
By the age categorisation, patients less than 50 years old
do not show a significant difference between the two treatments, while



older patients (greater than 50 years old) seem to benefit from
radical surgery. In so far as menopausal status is concerned,
post-menopausal patients benefit from radical mastectomy. Node
positive patients treated with radical mastectomy show an improved
survival, while node negative patients treated by radical mastectomy
or by simple mastectomy and radiotherapy show similar survival
patterns.

ENTRY          R Mx        S Mx + XRT      Total

1964            43             41            84
1965            61             48           109
1966            41             60           101
1967            36             36            72
1968            38             30            68
1969            33             35            68
1970            28             21            49
1971             8              2            10

Total          288            273           561

Age         54.2 ± 9.2     55.0 ± 8.8    54.6 ± 9.0
Size         3.6 ± 1.5      3.6 ± 1.3     3.6 ± 1.4

T1              37             19            56
T2             198            199           397
T3              53             54           107

N0             199            179           378
N1              88             93           181

S1             164            143           307
S2              67             74           141
S3              57             55           112

Pre             89             74           163
Meno            21             17            38
Post           178            181           359

TABLE (6.5.1)

6.6 Analysis of the data using the Cox's proportional hazard model

From the previous section there are certain points that we notice.
One is that for certain covariates the relative hazard rate depends
not only on the covariate under study but also on the time at which the
covariate is looked at. That is, there seems to be a suggestion that
the effects of some covariates are not uniformly the same for the
subgroups but are time dependent. There is also a slight
inconsistency in the manner in which treatments affect patients with
different prognostic status.

The effects described above are basically different forms of
interaction that may be present in our data. The first describes
a possible interaction between time and a covariate, while the latter
describes an interaction between two covariates. Although we have
introduced the idea of interaction here, we are not implying that
the interaction is statistically significant; the difference between
the significance levels in the various strata may be attributed purely
to the sample sizes of the subgroups. It is a point we will examine
in more detail in later parts of this section.



The model we are concerned with is of the form

$$\lambda(t, Z) = \lambda_0(t) \exp(\beta_1 Z_1 + \beta_2 Z_2 + \beta_{12} f(Z_1, Z_2)) \qquad (6.6.1)$$

The above is a 3 parameter model representing a $Z_1$, $Z_2$ covariate
interaction. Alternatively a time dependent covariate model may
be represented by

$$\lambda(t, Z) = \lambda_0(t) \exp(\beta_1 Z_1 + \beta_2 f(Z_1, t) + \ldots) \qquad (6.6.2)$$

Depending on the form of the variable under study, the number of
covariates will vary and we may end up with more than 3 and 2
covariates in the above models respectively. For example T stage
represents size of the tumour and is composed of 3 categories. For a
representation of such a variable we require two variables, say
$Z_1$ and $Z_2$, giving

    $Z_1 = 0$, $Z_2 = 0$ for T1
    $Z_1 = 1$, $Z_2 = 0$ for T2
and $Z_1 = 0$, $Z_2 = 1$ for T3

Using the above parameterisation we can test the significance of
T stage values as a prognostic indicator without making assumptions
on the ordering of the categories. An alternative approach would
be to allow a single variable Z with values -1, 0 and 1 to indicate a
linear categorisation of the T stage values.

In general the numerical values attached to the quantitative
covariates should not be a major problem. For the example of
T staging there may be a slight loss of efficiency with the latter
approach if there is a difference in the pattern of the influence of
the size. On the other hand, using two covariates to remove the
effects of size, as in the former description with $Z_1$ and $Z_2$,
is less convenient and more time consuming if the effect of size on
survival is uniformly the same. For the broad purpose of exploratory
analysis of the data, thoughtful parameterisation allows the study of
a large number of independent variables without introducing
a large number of covariates.

Initially we are only interested in the main effects of
the described parameters in exploring the variability of the failure
time from randomisation to time of death. However later we will study
other failure times and intervening events. The method we adopt to
explore the data is in some ways similar to what is usually termed
a stepwise regression method, in which fixed significance levels are
adopted for inserting covariate estimates into the model.
We set the limits to be a probability value of 0.100 for entry and
a probability level of 0.150 for removal. At each step we estimate
all relevant parameters, consider the parameters that are significant
and introduce only the most significant into the model. At each stage,
if a parameter estimator that is already in the model becomes non
significant (because of its association with variables added to the
model), we remove the newly non-significant effect. We will
deviate from the above approach in our exploratory analysis by
considering certain strata variabilities separately. Further, unlike
the initial stages, where we study the parameters in relation to
main effects of covariates only, in the next stage we will consider
models with main effects and a corresponding prognostic and treatment
interaction term. One point to note is that whenever we mention the
size and age covariates in this chapter, we use a normalised
transformation of the effect by letting

$$Z^{*}_{ij} = (Z_{ij} - \mathrm{mean}(Z_j)) / \mathrm{S.D.}(Z_j)$$

where $Z_{ij}$ refers to a covariate for a patient, and mean$(Z_j)$ and
S.D.$(Z_j)$ refer to the mean and standard deviation of a particular
covariate for all patients.
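
The normalising transformation can be sketched as follows. This is an illustration only: the text does not say whether the standard deviation uses an n or n - 1 divisor, so the sample form (n - 1) is assumed here, and the ages shown are hypothetical.

```python
import statistics


def standardise(values):
    """Centre a covariate on its mean and scale by its standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample s.d., n - 1 divisor
    return [(v - mean) / sd for v in values]


ages = [40.0, 50.0, 60.0]          # hypothetical ages, not trial data
print(standardise(ages))
```

After this transformation a coefficient for age or size is the change in log hazard per standard deviation of the covariate, which makes the two coefficients comparable in scale.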

In the first stage of the stepwise procedure both age and
menopausal status are highly significant (p < 0.0001), with
menopausal status being a slightly stronger prognostic factor than
the actual age parameterisation.

We thus continue with a categorised analysis of the relative
risks due to age. Instead of considering age as a continuous variable
we categorise the scale into younger than 50 years of age and older than
50 years. The value of the $\hat\beta$ relative risk is then noted.
As we pointed out, the relative risk value is significant at a
probability value p < 0.0001. Now in a comparison of the age effect
tested by the two methods, the age effect as a continuous variable
gives a $\beta$ value of 0.3198 and a standard error of 0.0798, while
as a categorised variable, almost coinciding with the sectors of
menopausal status, it presents a $\beta$ value of 0.3321 and a standard
error of 0.0745. We perform an analysis by stratifying the data into
premenopausal and postmenopausal groups. The two groups are then
analysed by assessing the age effect on them separately. The analysis
indicates that the age effect reduces to insignificant levels. Later
we will consider the age effect with time dependent parameters so
that, rather than categorise the age variability, we may obtain a
similarly flexible indication by a parametric function of the age
effect.



Following the age and menopausal status of the patients, the
most direct prognostic factors are the actual size (p < 0.0001), T stage
(p = 0.0004), node status (p = 0.0007), treatment option (p = 0.0016)
and S stage (p = 0.0024). Clearly treatment is a significant effect and
is of special importance to our study. At this stage we continue
with the stepwise regression as described. Later we will consider
forcing the treatment effect in at the first step so that we may check
the consistency of the model in a situation where the treatment effect
has a priori precedence. Site of the disease seems to play only a
marginal role, due to the lateral half involvement (p = 0.11).

Before dispensing totally with the various site of disease
indicators, we perform a stratified analysis based on each single
site of the initial tumour, as defined in section 6.4, together with
the set of covariate effects that have been considered significant up
until now. Without presenting too much detail, once again age and
menopausal status play the most important role in defining the
survival rates. The relative risk rates are closely related for
each of the strata, and there is an indication that the age and
menstrual status effects are consistently in a similar direction
within the various sites.



The rest of the covariates we have been interested in
at this stage for this particular failure time do not reach a
significant level. The covariates that we will ignore for the rest of
the analysis of this particular event are side of the initial lesion,
other sites of lesion and the patient's conception of the time from
first noticing the tumour to the time of the operation.

The most direct prognostic indicator is the menopausal status
of the patient. We introduce this variable into the general model
of the Cox's approach.

$\beta_{men} = 0.03558$    S.E. = 0.0720

With this model the pattern of the significance of the remaining
prognostic factors changes to some extent. Actual size remains the
most important factor in describing the remaining variability in
survival (P < 0.0001). Node status becomes more significant
(P = 0.0002) than the T categorisation of the size of tumour (P = 0.0008).
The S stage is still significant (P = 0.0018) and finally, for this
stage of the stepwise regression, the treatment option produces a
significant contribution with P = 0.0029.

The menopausal variable clearly removes the contribution
of actual age in explaining survival totally, with the significance
value reduced to P = 0.42. The premenopausal patients are generally
better survivors than the menopausal or post-menopausal patients.

We can also state that larger tumours are an indication of
a worse survival. A stratified analysis based on node status
indicates that this statement is true for both node negative and
node positive patients. In terms of menopausal status, a stratified
analysis of menopausal status with covariate analysis of size
indicates that the

introducing size of the initial tumour, the significance value of T
stage reduces to P = 0.60. Also the S stage significance reduces to
P = 0.25. The only covariates which still show significant levels are
node status and treatment option, with P = 0.0004 and P = 0.0017
respectively.

In the next step we introduce the node status covariate.

$\beta_{men} = 0.3861$     S.E. = 0.0729
$\beta_{size} = 0.1981$    S.E. = 0.0399
$\beta_{N} = 0.4315$       S.E. = 0.1205

Node negative patients are a better prognostic group than the node
positive patients. After introducing the 3 major prognostic factors,
namely menopausal status, actual size of the tumour and node histology,
the only factor remaining that still shows a significant contribution
to survival is the trial option, P = 0.0017. With stratified analysis
we study the two effects that do not show any significance, namely T
stage and the S stage of cases, to make sure that the reason for
this loss of significance is not due to the assumptions of the
proportional hazards. The T stage is only a function of the size, and
this is reflected in the stratified analysis of T stage in the way
in which the variability in survival due to the size effect reduces
to insignificant levels. For menopausal status we obtain similar
estimators of 0.34, 0.38 and 0.39 on the relative risk factor within
the T1, T2 and T3 strata respectively.



By the definition of the S stage, node status has a direct
role in defining the staging system. We note that the 3 strata of
stage give similar directions of association for the menopausal status
and size of the tumour in terms of survival time.

It is also important to note that the significance of the different
prognostic factors generally varies with the introduction of different
terms into the model. This is due to their inter-relationship.
However, the significance of the treatment option before introducing
any term in our model is P = 0.0016, and after introducing 4 terms the
significance is P = 0.0017, which is very close to the original value,
reflecting the similarity of the treatment groups with respect to the
distribution of other covariates.



The coefficients of the model after the introduction of the
option indicators are:

$\beta_{men} = 0.3847$       S.E. = 0.0730
$\beta_{size} = 0.2040$      S.E. = 0.0401
$\beta_{N} = 0.4319$         S.E. = 0.1206
$\beta_{option} = 0.3656$    S.E. = 0.1169

The two treatment options are radical mastectomy and simple mastectomy
with radiotherapy, and the model indicates that patients may benefit
from radical mastectomy in terms of their survival.
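
As a rough reading aid, the coefficients and standard errors above can be turned into hazard ratios and approximate Wald tests. This is a sketch under stated assumptions: the normal approximation and the 1.96 critical value are conventional choices, the thesis's own P values may come from likelihood or score tests and so need not agree exactly, and the function name `wald_summary` is illustrative.

```python
import math


def wald_summary(beta, se, z_crit=1.96):
    """Hazard ratio, approximate 95% interval and two-sided Wald p-value."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return {
        "z": z,
        "p": p,
        "hazard_ratio": math.exp(beta),
        "ci": (math.exp(beta - z_crit * se), math.exp(beta + z_crit * se)),
    }


# Coefficients and standard errors of the four covariate model in the text.
final_model = {
    "men": (0.3847, 0.0730),
    "size": (0.2040, 0.0401),
    "N": (0.4319, 0.1206),
    "option": (0.3656, 0.1169),
}

for name, (beta, se) in final_model.items():
    s = wald_summary(beta, se)
    print(f"{name:7s} HR = {s['hazard_ratio']:.2f}  "
          f"CI = ({s['ci'][0]:.2f}, {s['ci'][1]:.2f})  p = {s['p']:.4f}")
```

For the option covariate the Wald p-value comes out close to the P = 0.0017 quoted earlier, and the hazard ratio of roughly 1.44 expresses the size of the treatment difference on the relative risk scale.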

We finally perform additional analysis in order to be certain
that the general final model is representative of the subgroups and
categories that it represents. The main aim is to check on the
possible existence of smaller subgroups showing a pattern of
survival rates which is different in direction to the general model.
We perform a stratified analysis of each category of a covariate with
respect to the other covariates. At this stage we ignore treatment
option but later do formal tests on it. Altogether there is a slight
deviation depending on the sample size of each covariate set; however,
the stratified analysis does not suggest that there exists a subgroup
with a significantly different prognostic value in the opposite
direction to the general model.

Up until now in the study of the relative effects of the
prognostic factors we have introduced the covariates according to their
level of significance. Study of the treatment option, however, is
the major objective of a trial. Now in the initial step of the
categorising of the patient population we introduce the treatment
option. Further, for each main effect prognostic factor we introduce
a set of first order interaction covariates that act multiplicatively
between option and the other prognostic factors in terms of the model
(6.6.1). It is important to note that if our intention at this stage
were purely a study of the interaction effects, then we would have
continued by inserting interaction effects into the above 4 covariate
model. However, as an alternative to the previous stepwise procedure,
we introduce the option effect in the first step to study the behaviour
of the different models and also to check the consistency of the final
model. We use a generalisation of the model (6.6.1) as

$$\lambda(t, Z) = \lambda_0(t) \exp(\beta_1 Z_1 + \beta_2 Z_2 + \beta_{12} Z_1 Z_2 + \ldots)$$

where $\beta_{12}$ is introduced for assessing an interaction between
treatment and a prognostic effect. The two variables size and age are
continuous and may be time dependent. We will deal with these models
in Chapter 7.
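
The role of the interaction coefficient can be illustrated numerically. The sketch below assumes binary 0/1 covariates, with $Z_1$ as treatment option and $Z_2$ as a prognostic factor; the coefficient values are hypothetical and chosen only to show the behaviour, and the function names are illustrative.

```python
import math


def relative_hazard(z1, z2, beta1, beta2, beta12):
    """Relative hazard exp(b1*Z1 + b2*Z2 + b12*Z1*Z2) against baseline."""
    return math.exp(beta1 * z1 + beta2 * z2 + beta12 * z1 * z2)


def treatment_hazard_ratio(z2, beta1, beta2, beta12):
    """Hazard ratio for Z1 = 1 against Z1 = 0 within the stratum Z2 = z2."""
    return (relative_hazard(1, z2, beta1, beta2, beta12)
            / relative_hazard(0, z2, beta1, beta2, beta12))


# Hypothetical coefficients, for illustration only (not estimates from the trial).
b1, b2, b12 = 0.35, 0.40, 0.20

for z2 in (0, 1):
    print(z2, round(treatment_hazard_ratio(z2, b1, b2, b12), 3))
```

When the interaction coefficient is zero the treatment hazard ratio is the same in both strata; a non-zero value multiplies the ratio by exp(beta12) in the stratum with $Z_2 = 1$, which is the multiplicative interaction the text refers to.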

In what follows in this section we will consider functional
forms of $f(Z_1, Z_2)$ from (6.6.1). In the case of a binary
categorisation of a variable, a functional form of
$f(Z_1, Z_2) = Z_1 Z_2$ is sufficient (as we have used 0, 1 to indicate
the two levels), so long as enough consideration has been given to ease
of interpretation. In fact most of the variables we will be
considering are of the above binary form. The continuous variables
like age and size can also be transformed to binary categorisations by
considering the high and low levels of their scale and independent
dummy variables. Later in this chapter we will consider continuous
forms of the size and age variables. In these conditions a continuous
parametric representation may be useful. We will later consider a
possible functional form of age and size in the presence of a binary
treatment effect, namely models of the form

$$\lambda(t, Z) = \lambda_0(t) \exp(\beta_1 Z_1 + \beta_2 Z_2 + \beta_3 Z_3 + \beta_{22} Z_2^2 + \beta_{33} Z_3^2)$$

and

$$\lambda(t, Z) = \lambda_0(t) \exp(\beta_1 Z_1 + \beta_2 Z_2 + \beta_3 Z_3 + \beta_4 Z_2 Z_3 \ldots)$$

where $Z_1$ refers to treatment, $Z_2$ to age and $Z_3$ to size.



Initially we consider categorised variables with binary
interaction effects, $Z_1 Z_2$, for the different subgroups. We
introduce the effect of treatment option into the total sample. The
main effect of option is therefore represented in the relative risk
function by

$\beta_{option} = 0.3677$    S.E. = 0.1168

Now the overall variance due to other prognostic main effects
increases and so their significance value is reduced. However their
relative significance does not change from that prior to introducing
treatment option.

Menopausal status (P < 0.0001) and age (P < 0.0001) are the
most important factors, followed by size (P < 0.0001), T stage
(P = 0.0008), node status (P = 0.0009) and S stage (P = .00«4). Sites
of the tumour that have lateral half involvement are again only
marginally significant (P = 0.06). At this stage, prior to introducing
other main effects, it is not possible to try to interpret the value
of the interaction effect parameters. It must be noted that interaction
effects prior to introduction of the main effects do often show a
significance, with a probability value slightly lower than that of
the corresponding main effects.



An explanation is in order in regard to the value of such
interaction effects. The main reason is that the variability due to
the main effect has not been removed yet. We further note that the
significance levels of the interaction parameters are lower than
those of their main effects. The major variability is due to
the main effect of age (P < 0.0001) and its interaction effect
(P = 0.0021). In terms of the $\beta$ estimators of the relative risk
function we also note less significant values for the interaction
effect, while the actual magnitude for the direction of the effect is
always positive and at a lower level. This pattern implies that the
only type of interaction effect that we may expect to find will have a
positive multiplicative effect.



Once again the major variability is due to age and its
interaction effect, with $\beta = 0.0042$ and $\beta = 0.0042$
respectively, closely followed by the menstrual status main effect at
$\beta = 0.3492$ and its interaction effect at $\beta = 0.3022$. It
seems reasonable to add further prognostic factors that show a
significant probability value. We thus introduce the main effects into
the model one by one depending on their relative significance at each
stage. First we introduce menopausal status with option.

$\beta_{option} = 0.3475$    S.E. = 0.1168
$\beta_{men} = 0.3492$       S.E. = 0.0720

By this covariate all variability due to age is again also explained.
The interaction effects of age with option and of menopausal status
with option also become insignificant. Size (P < 0.0001),
N (P = 0.0002), T (P = 0.0014) and S (P = 0.0026) are all significant.
Site with lateral half tumours also increases in significance
(P = 0.028). Now we introduce the size of the tumour.

$\beta_{option} = 0.3650$    S.E. = 0.1169
$\beta_{men} = 0.3598$       S.E. = 0.0723
$\beta_{size} = 0.2045$      S.E. = 0.0406

The only remaining factor that makes a significant contribution is
node histology. We also note that by introducing the main effects of
the prognostic factors the interaction effects are also explained.
Hence finally we obtain the same model with the same prognostic factors
as in the previous approach, since none of the interaction effects have
contributed substantially to the explanation of the survival rates.

$\beta_{men} = 0.3847$       S.E. = 0.0730
$\beta_{size} = 0.2040$      S.E. = 0.0401
$\beta_{N} = 0.4319$         S.E. = 0.1206
$\beta_{option} = 0.3656$    S.E. = 0.1169

In the above discussions the general conclusion is that menopausal
status, size of the initial tumour and node histology are the main
prognostic factors that define a survival time for a group of patients.
However, size with T stage and menstrual status with age also
show a high level of dependence, and introducing one factor generally
compensates for the information due to the other. The same
may be said for the stage of the disease. Stage is a combination of
the node and size categories. However it seems that a better
assessment may be made by introducing node and size separately. In
fact, on considering a model with menstrual status, treatment
option and the effect of S stage represented by two covariate
indicators, we obtain

$\beta_{men} = 0.3921$       S.E. = 0.0782
$\beta_{S1} = 0.3952$        S.E. = 0.0281
$\beta_{S2} = 0.3161$        S.E. = 0.9791
$\beta_{option} = 0.3721$    S.E. = 0.1291

There is not a major difference noted for the new option and
menopausal status estimators. We further introduce 2 interaction
effects of option with the S1 and S2 parameters and they do not reach
a significant level. In terms of interpretation the previous
model was probably more straightforward than the present approach,
since with the latter one must always refer back to the
interpretation of S1 and S2, while the model with size and node gives
a clearer interpretation.



Now we use a method which is an alternative to the step-up
procedure and is usually termed a step-down procedure. It is again a
study of the relative significance of each factor when other factors
are present. We begin by fitting a model to the data in which
all the prognostic factors have been introduced. In the subsequent
steps we remove the effects one by one, depending on the level of
significance that the particular estimator contributes in regard
to defining the variability of the data. As before, however, we will
deviate from the standard procedure by looking at different strata
at each stage. In the step-down procedure we will only consider the
main effects, since up until now there has not been a major
inconsistency in the direction of the effects. The probability levels
we adopt are again 0.100 for re-entry of a term previously removed and
0.150 for removal of an effect from the model. The main purpose in
using this approach is to check on the consistency of the final model
of the last section in being able to describe the variability in the
survival rates. Further, by an extensive comparison of different
covariate effects we describe the improvements in the estimators of the
treatment parameters and also the ease of interpretation for each
prognostic effect.



Finally, the most relevant significance levels in the step-down
procedure are the significance levels for the removal of effects from
the model (unlike the step-up method). As before, the value of the
significance level for each effect changes at each step.



The first model we consider is the full model, containing all 
covariate effects that are suspected to play a part in defining the 
survival times. The following model is hence obtained. 



$\beta_{men} = 0.3231$       S.E. = .1138
$\beta_{side} = 0.0014$      S.E. = .1183
$\beta_{site} = 0.9933$      S.E. = .0731
$\beta_{size} = 0.1560$      S.E. = .0508
$\beta_{T} = 0.3847$         S.E. = .2311
$\beta_{N} = 0.6381$         S.E. = .1772
$\beta_{S} = 0.2510$         S.E. = .1538
$\beta_{option} = 0.3787$    S.E. = .1191
$\beta_{size} = 0.0077$      S.E. = .0110
$\beta_{year} = 0.0008$      S.E. = .0028

Now by reviewing each of the above terms of the model once again
we can assess the relative importance of each factor with the above
restrictions. Basically there is no inconsistency with the previous
approach and the model considers the same factors as important
prognostic indicators. However there are slight differences in the
order of their significance. Node histology is the most significant
indicator (P = 0.0002), followed by option (P = 0.0014), size
(P = 0.0016), menopausal status (P = 0.0045), T (P = 0.0867) and
S (P = 0.0936). The rest of the covariates are treated as factors
making insignificant contributions. These factors are respectively
lateral half involvement, age, symptom and side of the tumour. The
values for the important prognostic factors seem not to change much if
the insignificant factors are removed one by one. However, the value
of the coefficient for each significant prognostic factor is slightly
different and generally the probability values are higher. The model
we obtain after the removal of the insignificant factors is

$\beta_{men} = 0.3788$       S.E. = 0.0731
$\beta_{size} = 0.1653$      S.E. = 0.0505
$\beta_{T} = 0.4054$         S.E. = 0.2295
$\beta_{N} = 0.6521$         S.E. = 0.1739
$\beta_{S} = 0.2617$         S.E. = 0.1529
$\beta_{option} = 0.3585$    S.E. = 0.1171

giving menopausal status (P < 0.0001) as the most significant factor,
node (P = 0.0001) as the second factor, size (P = 0.0009) and
option (P = 0.0022) as significant, and T (P = 0.0687) and S (P = 0.0912)
as marginally significant. At the more conventional 5% significance
level we can also remove T and S staging. This reduces the model
to the initial 4 covariate model. The slight indication for the
T significance level shows that size may not have a linear effect in
the time scale. This will be looked at more closely later in this
section.



Now we continue with the analysis for the assessment of the
important prognostic factors for the period from randomisation
to the metastatic spread of the disease. The study of the time to the
development of metastatic disease is also important in that it defines
the spread of the disease more directly and, unlike the time to death,
is not affected by factors such as death from old age or other causes.
Considered singly, the major prognostic factor initially is the size
of the initial tumour (P < 0.0001), followed by the closely related
S stage (P < 0.0001), N node status (P < 0.0001) and T stage
(P = 0.0001). However, menopausal status is only important after the
size effect, with P = 0.0001, and age is now even less significant at
P = 0.0493. Option is again significant, although with a loss in
significance. The site of the tumour being lateral, however, seems to
play a more important role with P = 0.014. If we introduce the size
effect into the model then the effects due to T stage (P = 0.4) and
S stage (P = 0.03) are reduced. The effect of node status (P = 0.0001)
remains highly important. The relative importance of age (P = 0.11)
and option are both reduced. Site of the tumour being in the lateral
half is only marginally significant (P = 0.05).

$\beta_{size} = 0.2421$    S.E. = 0.0426

In the study of time to death the menopausal status of the patient
played the most important role. In the study of the variability due to
time to metastatic disease, the most important contributions are made
by the size of the tumour followed by the node histology.

$\beta_{size} = 0.2255$    S.E. = 0.0421
$\beta_{N} = 0.5025$       S.E. = 0.1280

The menopausal status of the patients is then highly significant
(P = 0.0001) only after the above two variables. The menopausal status
is then followed in significance level by age (P = 0.02); however, it
is reasonable to assume that the age effect will be explained by the
menopausal status. Finally treatment option follows with
P = 0.04. After the introduction of size and node histology, the
previously major contributions of T and S reduce to insignificant
levels. This represents once again a slight deviation from the
analysis of time to death, where the stage of the disease, with node
and size effects present, showed a marginal significance.
Therefore it is perhaps indicative of a node status or size interaction
within the time scale to death.



Finally we introduce the treatment option. 

$\beta_{\mathrm{size}} = 0.2234$, S.E. = 0.0431

$\beta_{N} = 0.5043$, S.E. = 0.1321

$\beta_{\mathrm{men}} = 0.2967$, S.E. = 0.0751

$\beta_{\mathrm{option}} = 0.2641$, S.E. = 0.1432

We now approach the study of the response variable (time to metastatic
disease) with the actual treatment forced into the model. At the
same time we are interested in any interactions that may be present
between the possible prognostic factors and the treatment option.
With only the treatment option present we obtain

$\beta_{\mathrm{option}} = 0.2532$, S.E. = 0.1254

Once again, consistent with the previous study of time to metastatic
disease, the actual size of the tumour plays the most important role, followed
by node histology, T stage and S stage.

$\beta_{\mathrm{size}} = 0.2440$, S.E. = 0.0426

$\beta_{\mathrm{option}} = 0.2638$, S.E. = 0.1254

Following the introduction of the size and option effects we note that
the test of interaction between the two variables gives P = 0.9.
Node histology also has a significant effect once size and option
effects have both been introduced into the model. However, once again,
unlike the time to death response variable, menopausal status is
relatively less significant.

$\beta_{\mathrm{size}} = 0.2283$, S.E. = 0.0422

$\beta_{N} = 0.4979$, S.E. = 0.1271

$\beta_{\mathrm{option}} = 0.2560$, S.E. = 0.1255

Again no interaction effect is noted between node status and option.
The most significant factor remaining is menopausal status. With
the entry of this latter factor, the major factors that influence survival
are once again the major factors that influence time to metastatic
disease.

$\beta_{\mathrm{men}} = 0.2967$, S.E. = 0.0751

$\beta_{\mathrm{size}} = 0.2234$, S.E. = 0.0431

$\beta_{N} = 0.5043$, S.E. = 0.1321

$\beta_{\mathrm{option}} = 0.2641$, S.E. = 0.1432

Once again we look for possible interaction effects
with the treatment option, and again there seem to be none acting.
Lateral site of the tumour was a factor that, for time to
death, was initially marginally significant and, with the removal
of other major factors, became less and less significant. In the
study of the time to metastatic disease there seems to be a similar
trend, and in the end lateral half involvement is not significant,
with a probability value of P = 0.09. The next response
variable that we study in relation to the explanatory value of the
prognostic indicators is the time from randomisation to the local
progression of the disease. Once again we use a stepwise procedure
similar to that of the last section. One by one we introduce important
prognostic factors and observe their effects. Finally we allow
a test of interaction between the treatment main effect and the
prognostic indicators. The relative risk parameters of the final model
are similar, in terms of the order of importance of
the prognostic covariates, to the parameters obtained for the models of
time to metastatic disease. However, the magnitudes of the various
estimators are different. The final model is thus composed of the
parameters



$\beta_{\mathrm{men}} = 0.2481$, S.E. = 0.0621

$\beta_{\mathrm{size}} = 0.2013$, S.E. = 0.0510

$\beta_{N} = 0.5518$, S.E. = 0.1511

$\beta_{\mathrm{option}} = 0.2421$, S.E. = 0.1080

Before we finish this chapter, which has been based on categories
of the various variables, we consider functional forms of the two
major prognostic indicators, namely the size of the initial tumour and
the age of the patient, when both are considered to be continuous
variables. We initially consider a main effect model of
size and age, with the treatment effect present and no interaction
in the model. This gives
a relative risk function with the following parameters:

$\beta_{\mathrm{option}} = 0.3511$, S.E. = 0.1172

$\beta_{\mathrm{age}} = 0.0068$, S.E. = 0.0171

$\beta_{\mathrm{size}} = 0.1950$, S.E. = 0.0432

In the earlier discussion, dealing with categorised size
and with age represented by menstrual status, we concluded that there is no
suggestion of an interaction between size of the initial tumour and
age of patients at entry in describing the survival times. However,
we noticed a slight improvement in the treatment effect estimators of
the $\beta$ parameters. We now introduce a model of the relative
risks in which the age and size effects are assessed in a continuous
manner. The first model we consider is a relative risk function given
by linear effects of age and size together with their separate quadratic
effects, giving the hazard rate

$\lambda(t, Z) = \lambda_0(t) \exp(\beta_1 z_1 + \beta_2 z_2 + \beta_3 z_3 + \beta_{22} z_2^2 + \beta_{33} z_3^2)$

where 1 refers to treatment option, 2 to age and 3 to size,
giving the following parameter estimates:

$\beta_{\mathrm{option}} = 0.3578$, S.E. = 0.1151

$\beta_{\mathrm{age}} = 0.0068$, S.E. = 0.0171

$\beta_{\mathrm{size}} = 0.1987$, S.E. = 0.0451

$\beta_{22} = 0.0003$, S.E. = 0.0241

$\beta_{33} = 0.0170$, S.E. = 0.0642

Clearly there is no suggestion of size and age playing a quadratic
role in the explanation of the survival times.
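
One informal way to read the estimates above is to compare each coefficient with its standard error through a Wald statistic z = estimate / S.E. The following is a minimal sketch, assuming asymptotic normality of the estimators; the function name `wald` and the dictionary `estimates` are illustrative and not part of the original analysis.

```python
import math

def wald(beta, se):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail area of N(0, 1)
    return z, p

# Estimates quoted in the text for the quadratic relative risk model.
estimates = {
    "option": (0.3578, 0.1151),
    "age": (0.0068, 0.0171),
    "size": (0.1987, 0.0451),
    "age^2": (0.0003, 0.0241),
    "size^2": (0.0170, 0.0642),
}

for name, (beta, se) in estimates.items():
    z, p = wald(beta, se)
    print(f"{name:8s} z = {z:6.2f}  p = {p:.4f}")
```

Under this reading the quadratic terms have |z| well below 2, which agrees with the conclusion that neither size nor age plays a quadratic role.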



Finally we consider an interaction of the continuous age
and size. In the following model of the relative risk function we
consider an interaction of a form in which not only a multiplicative
effect of the prognostic indicators is present but also,
simultaneously, an additive effect. We then have
a model of the form

$\lambda(t, Z) = \lambda_0(t) \exp(\beta_1 z_1 + \beta_2 z_2 + \beta_3 z_3 + \beta_4 (z_2 + 10 z_3)^*)$

In the above model, 1 refers again to treatment, 2 to age, 3 to size
and 4 to the interaction parameter. Further, for the convergence
of the maximum likelihood estimator, we transform the actual covariate
indicators so that they are almost normalised. That is, we let

$Z_2$ = (Age - mean Age) / standard deviation of Age

$Z_3$ = (Size - mean Size) / standard deviation of Size

and

$(Z_2 + 10 Z_3)^*$ = [Age + 10 x Size - mean (Age + 10 x Size)] / standard deviation of (Age + 10 x Size)
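
The transformation just described can be sketched as follows. This is an illustration only: the ages and sizes are hypothetical values, the helper name `standardise` is an assumption, and the sample (n - 1) standard deviation is used because the text does not say which divisor was applied.

```python
import statistics

def standardise(values):
    """Centre the values at their mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: n - 1 divisor
    return [(v - mean) / sd for v in values]

# Hypothetical covariate values, for illustration only.
ages = [42.0, 51.0, 58.0, 63.0, 70.0]
sizes = [1.5, 2.0, 3.5, 4.0, 6.0]

z2 = standardise(ages)                                          # standardised age
z3 = standardise(sizes)                                         # standardised size
z4 = standardise([a + 10.0 * s for a, s in zip(ages, sizes)])   # combined term

print([round(v, 3) for v in z4])
```

Each transformed covariate then has mean zero and unit standard deviation, which keeps the coefficients on comparable scales during the likelihood maximisation.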



We thus have

$\beta_{\mathrm{option}} = 0.3912$, S.E. = 0.1018

$\beta_{\mathrm{age}} = 0.0067$, S.E. = 0.0168

$\beta_{\mathrm{size}} = 0.2012$, S.E. = 0.0419

$\beta_{4} = 0.0003$, S.E. = 0.0041

Once again there is no suggestion of an additive with multiplicative
interaction of size and age in the relative risks. It seems useful to
consider various interactions of the prognostic effects when they are
of continuous form. A better, and to some extent related,
approach is to adopt time dependencies of the continuous effects.
In Chapters 7 and 8 we will consider such time dependencies. In the
next chapter we will also consider multiple risk approaches. At this
stage we only mention the relevance of the present approach to
multiple risks, and continue with the analysis in the next chapter
after some further methodological developments. Apart from the response
variables looked at so far there are other response variables
of interest, such as the time from local disease to death
or from metastatic disease to death. For such variables we
require an adjustment for the initial time segment for the proper
assessment of the treatment and covariate effects. With the
approach we have followed up to now, such an adjustment can be made
by stratifying the response variable according to the time from
randomisation to the present event. We may then have models of the form

$\lambda_j(t, Z) = \lambda_{0j}(t) \exp(\beta Z)$

where, in the above example, $\lambda_j$ refers to the hazard rate for time from
metastatic disease to death and j signifies categories of the time
to metastatic disease. Clearly this is an example of a situation in
which we have multiple events. An alternative is to use time dependency,
as described above, rather than stratification of the baseline
hazard $\lambda_{0j}(t)$. These two latter considerations make the proportional
hazards model a flexible method for such studies.
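
The stratified form can be illustrated by a small sketch of the partial log-likelihood in which each stratum keeps its own risk sets, so that the baseline hazard never needs to be specified. The code assumes a single covariate and no tied failure times; the function name `stratified_partial_loglik`, the stratum labels and the records are hypothetical.

```python
import math

def stratified_partial_loglik(beta, records):
    """Cox partial log-likelihood with a separate baseline hazard per stratum.

    records: list of (stratum, time, event, z) with a single covariate z;
    event is 1 for a failure and 0 for a censoring. Ties are assumed absent.
    """
    loglik = 0.0
    strata = {s for s, _, _, _ in records}
    for s in strata:
        rows = [r for r in records if r[0] == s]
        for _, t, event, z in rows:
            if event != 1:
                continue
            # Risk set: members of the same stratum still under observation at t.
            denom = sum(math.exp(beta * zj) for _, tj, _, zj in rows if tj >= t)
            loglik += beta * z - math.log(denom)
    return loglik

# Hypothetical records: strata stand for categories of time to metastatic disease.
records = [
    ("early", 5.0, 1, 1.0),
    ("early", 8.0, 0, 0.0),
    ("early", 12.0, 1, 1.0),
    ("late", 3.0, 1, 0.0),
    ("late", 9.0, 1, 1.0),
]

print(stratified_partial_loglik(0.0, records))
print(stratified_partial_loglik(0.5, records))
```

Because the risk sets never mix strata, a patient who became metastatic early is only ever compared with other early cases, which is the adjustment for the initial time segment described above.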
CHAPTER 7
MULTIVARIATE RISKS

If we confine the method of analysis to the approaches
discussed in the previous section, we may not be able to assess most
efficiently the effects of various treatments in the presence of progression,
metastatic disease, or other forms of intervening events. The
main parts of this chapter deal with trials in which there are multiple
events within the time scale of study. Initially we will deal with
models in which cases move from one state to another. In particular
we refer to semi-Markov models and the analysis of data in
groups. Such models allow a quick analysis to be performed and are
useful for an exploratory analysis. More importantly for this thesis,
they make a good conceptual shift from the models of the previous chapter
to a situation of multiple risks. Later we will deal with the
development of the proportional hazards approach with a functional
form of a time dependent factor for the intervening event.

7.1 Initial developments of the methodology. 



In 1959 Bartlett, in a paper on the impact of the theory
of stochastic processes on statistics, stated that "correct specification
of statistical problems has only become possible in terms of stochastic
process". Earlier, in 1950, Neyman had written a chapter on "competing
risks" in his textbook on statistics and probability theory.
These methods were inferred from a relatively simple illness and death
model. His original ideas on this work had arisen from works
similar to those of Daniel Bernoulli which were mentioned in the
introduction. In particular Neyman was interested in the problem of
assessing the risk of dying from breast cancer by comparing the risk
of dying from cancer after treatment with that of dying from other
causes or being lost to follow-up. This method of Neyman, often
referred to as Fix-Neyman, clearly differs from ordinary survival
analysis in that in the latter there is only one transient state
(entry) and one absorbing state (death), while in the present
context one is concerned with different causes of death, progression,
regression and possibly other stages.

When there are several end points present, there is a
general and almost traditional way of analysing the data based on
3 assessments: crude, partial crude and net probabilities
of survival. These concepts have been used by those
studying failure time in occupational health studies or in
epidemiological studies of chronic diseases. However, a comment made
by Stormer et al (1980) expresses fully the associated problems:
"There is now mounting evidence in the biomedical literature to
suggest that experimental methodologies are deficient when applied
to the investigation of chronic diseases. Chronic disease appears
to be substantially more complex than acute disease in several respects:
chronic disease is dynamic. It represents the long term cumulative
effects of interactions between a host biological system and the
surrounding environment. The environmental influences are not static
so chronic disease acquires a time varying characteristic... it is
possible that any combination of the above factors be influencing a
trial to a significant extent". Although our position does not go
along with some of the above comments regarding the
generality of the environmental effects, one aspect of the statement
holds even within randomised trials: the complexity of
chronic disease requires complex processes by which time varying
characteristics may be incorporated.



Such problems were initially related to an approach that
was named competing risks. J. Cornfield (1957), on competing risks
and clinical trials, puts the approach in the perspective of
the language of cause and effect. He defines a formal effect as one in which
individuals who died from some extraneous cause had no chance of
dying from the cause under study. Empirical effects, further, relate to
those who died from extraneous causes and might have had a probability of
developing the disease of interest which differs from the probability for
those who died from the disease of interest. The latter effects are then
suggested to be analogous to withdrawal at the time of the analysis.

C.L. Chiang (1964) develops the concept of probability for
competing risks in a formal manner by defining 3 separate functions:

(1) The crude probability of survival.

$Q_i(t)$ = (probability that an individual alive at t will fail in
(t, t + h) from cause i in the presence of all other risks).

(2) The net probability of survival.

$H_i(t)$ = (probability that an individual alive at t will fail in
(t, t + h) if risk i was the only risk acting).

(3) The partial crude probability of survival.

$Q_{i.j}(t)$ = (probability that an individual alive at t will fail in
(t, t + h) from cause i, if cause j was eliminated as a
cause of death).
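
To see how the three quantities differ numerically, the sketch below evaluates them under a deliberately simple assumption that is not made in the text: independent risks with constant cause-specific hazards over the interval of length h. The hazard values and the function names `crude`, `net` and `partial_crude` are hypothetical.

```python
import math

def crude(hazards, i, h):
    """Probability of failing from cause i in (t, t + h) with all risks acting."""
    total = sum(hazards.values())
    return hazards[i] / total * (1.0 - math.exp(-total * h))

def net(hazards, i, h):
    """Probability of failing in (t, t + h) if cause i were the only risk."""
    return 1.0 - math.exp(-hazards[i] * h)

def partial_crude(hazards, i, j, h):
    """Probability of failing from cause i in (t, t + h) with cause j eliminated."""
    remaining = {c: lam for c, lam in hazards.items() if c != j}
    return crude(remaining, i, h)

# Hypothetical constant hazards per unit time.
hazards = {"cancer": 0.10, "other": 0.05, "lost": 0.02}

print(round(crude(hazards, "cancer", 1.0), 4))
print(round(partial_crude(hazards, "cancer", "other", 1.0), 4))
print(round(net(hazards, "cancer", 1.0), 4))
```

Under these assumptions the crude probability is the smallest and the net probability the largest, since removing competing causes leaves more individuals exposed to the cause of interest.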



In a discrete situation we may divide the time scale into
segments and apply life table approaches. For a continuous case, based
on distributional assumptions, parametric methods may be used. C.L.
Chiang (1976) has extended the method and, as an alternative approach,
has used the Fix-Neyman model for two transient states and more than
two absorbing states. Such a model of two transient states is a
realistic model in which different patients with separate prognostic
values can be placed in different transient states. Individuals may
thus move from one transient state to another until, in a finite time,
they enter one of the absorbing states. An adequate way of explaining
such a phenomenon would then be based on the recorded number of
transitions and the times of the transitions between any two states.
C.L. Chiang (1979) develops this method further for the particular
case of chronic conditions. He makes the observation that the
disease advances with time from a mild through an intermediate to a
severe stage and then to death. The cases may die in any one of these states. A
few practical situations where the above assumptions can aid in the
analysis are given below. Later in this chapter we will describe
a more natural method for analysis.
(1) Definition of stages in diabetes.

Chemical diabetes -> Clinical diabetes -> Diabetes with complication

(2) Progression and treatment of leukemia.

Detection; Bone marrow grafting; Remission; Progression; Host versus graft disease

(3) Breast cancer.

Treatment -> Recurrence, local or general -> Death



What the above examples have in common is that the processes are always
irreversible; this is an observation which is useful in the further
development of the methodology. One further restriction that has
often hindered the general use of such approaches has been that of
robustness. It is often possible to develop a general maximum
likelihood function for the paths of progression; however, if one
considers distributions more complicated than the exponential
distribution, the method of maximum likelihood estimation becomes
unrealistic in terms of the quantity of calculations.

It is clear that what may be required for our form of
problem is a model that takes care of the problem of censoring and
uses the assumption of irreversibility. Such an approach is suggested
by Lagakos, Sommer and Zelen (1978), in which a non-parametric method
based on ranks of the sojourn times between the states is used. The
main purpose of using this approach here is to analyse the
data in an exploratory manner and to give a better description of the
semi-Markov models. Later, however, we will extend the methodology
to the proportional hazards models, in which similar tests can be
incorporated into the functional forms of time dependency with less
restrictive assumptions than the present semi-Markov models.

The semi-Markov models for partially censored data
provide a good construct for situations in which patients move from
one state to another. We assume that there exist h states,
denoted by $S_1, S_2, \dots, S_h$. Some of these states may be
transient states, that is, one may assume that the stay in such a state
is finite. All other states are restricted to absorbing states, that
is, patients after entry into this type of state will remain there until
the end of study. Without loss of generality one assumes that the
first states are transient and the rest are absorbing. For any case
history we have

$H = (S_0, T_1, S_1, T_2, S_2, \dots, T_m, S_m)$   (7.1.1)



where $T_i$ refers to the time of transition, or sojourn, between states
$S_{i-1}$ and $S_i$. For the assumption of a semi-Markov process to hold
two conditions must be satisfied. The first is that the next state
of a patient depends only on the current state and not on the
previous states; the second is that the sojourn times between states
are independent of each other. The length of a sojourn
time therefore depends only on the adjoining states.

We can thus define the following properties of
semi-Markov processes in a more mathematical setting. A case
history such as (7.1.1) can in fact be represented by the following
terms a(i), a(i,j) and F(t,i,j), where

$a(i) = \Pr(S_0 = i)$, the probability that the initial state is i;

$a(i,j) = \Pr(S_{n+1} = j \mid S_n = i)$, the probability that the next state is
j given the present state i;

$F(t,i,j) = \Pr(T_n > t \mid S_{n-1} = i, S_n = j)$, the probability that the sojourn
time between states n-1 and n exceeds t.

Further we let $F'(t,i,j) = -\partial F(t,i,j) / \partial t$ be the derivative of F with
respect to time.

We can thus represent the probability element associated with a
single history as

$a(S_0) \prod_{n=1}^{m} a(S_{n-1}, S_n) F'(t_n, S_{n-1}, S_n)$



In biomedical studies we require an absorbing state related
to the censoring times. We can allow such a state to exist and, without
loss of generality, let the last state be a censoring state, represented
by (h + 1). A case history is then represented by

$a(S_0) \prod_{n=1}^{m-1} [a(S_{n-1}, S_n) F'(t_n, S_{n-1}, S_n)]$

$\times [a(S_{m-1}, S_m) F'(t_m, S_{m-1}, S_m)]^{U(h - S_m)}$

$\times [\sum_{j} a(S_{m-1}, j) F(t_m, S_{m-1}, j)]^{U(S_m - h - 1)}$   (7.1.2)

where h is the last disease state, h+1 is censoring, and U(i) is set
to zero for i < 0 and U(i) = 1 for i ≥ 0.

The distribution F(t,i,j) can take various forms for the different
states. A simple choice would be an exponential distribution,
$F(t,i,j) = \exp(-\lambda_{ij} t)$. This distribution, however,
may be too restrictive. The choice of the distributional form of
the sojourn time is the major drawback in the proper use of this
type of method.
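
As an illustration of how the probability element of a single case history is assembled, the sketch below evaluates it under the exponential assumption just mentioned, with F(t,i,j) = exp(-rate x t) and the corresponding density rate x exp(-rate x t) in place of F'. The function name `history_likelihood`, the state numbering and all numerical values are hypothetical.

```python
import math

def history_likelihood(a0, a, rate, states, times, censored):
    """Probability element of one case history under exponential sojourns.

    a0[i]       : probability that the initial state is i
    a[(i, j)]   : probability that the next state is j given current state i
    rate[(i, j)]: exponential rate, so that F(t, i, j) = exp(-rate * t)
    states      : visited states S_0, ..., S_m (S_m omitted when censored)
    times       : sojourn times t_1, ..., t_m
    censored    : True when the last sojourn ended in censoring
    """
    lik = a0[states[0]]
    complete = len(times) - 1 if censored else len(times)
    for n in range(complete):
        i, j = states[n], states[n + 1]
        lam = rate[(i, j)]
        lik *= a[(i, j)] * lam * math.exp(-lam * times[n])  # a(i,j) times density
    if censored:
        i, t = states[-1], times[-1]
        # Censored sojourn: sum over possible next states of a(i,j) * F(t,i,j).
        lik *= sum(p * math.exp(-rate[(i2, j)] * t)
                   for (i2, j), p in a.items() if i2 == i)
    return lik

# Hypothetical three-state example.
a0 = {1: 1.0}
a = {(1, 2): 0.4, (1, 3): 0.6, (2, 3): 1.0}
rate = {(1, 2): 0.1, (1, 3): 0.1, (2, 3): 0.1}

# Moves 1 -> 2 after 5 time units, then is censored 3 units later.
print(history_likelihood(a0, a, rate, [1, 2], [5.0, 3.0], censored=True))
# Moves 1 -> 3 after 2 time units and is fully observed.
print(history_likelihood(a0, a, rate, [1, 3], [2.0], censored=False))
```

The censored branch corresponds to the last factor of (7.1.2): the patient is known only to have remained in the current state for the observed time, whichever state would have followed.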

A more robust procedure, however, may be adopted by use
of the ranks of the sojourn times. We will present the method by
deriving the relevant likelihood functions. Later we will expand
the results of Lagakos, Sommer and Zelen by deriving survival
estimators based on a predictor-corrector method.



In order to express the likelihood (7.1.2) in terms of
non-parametric maximum likelihood estimators, a parameterisation is
used in which the survivorship function G(t,i,j) is given by

$\sum_{j=1}^{h} a(i,j) F(t,i,j) = \prod_{j=1}^{h} G(t,i,j)$

The full likelihood is then expressed by the above authors as

$L = \prod_{i \in \text{transient states}} a(i)^{l_i} \prod_{j} L_{ij}$   (7.1.3)



where

$\log L_{ij} = \sum_{k=1}^{M} \{ \sum_{l=j+1}^{h+1} m_{ilk} \log G(r_k,i,j) + \sum_{l=1}^{j-1} m_{ilk} \log G(r_{k-1},i,j) + m_{ijk} \log [G(r_{k-1},i,j) - G(r_k,i,j)] \}$   (7.1.4)

Here $r_1 < \dots < r_M$ are the distinct sojourn times from state i
into j, $m_{ijk}$ is the number of sojourn times from state i to j of
length $r_k$, and $l_i$ is the number of subjects starting at state i.

By defining $P_{ijk} = G(r_k,i,j) / G(r_{k-1},i,j)$,
so that

$G(r_k,i,j) = \prod_{l=1}^{k} P_{ijl}$   (7.1.5)

the likelihood (7.1.4) may be rewritten as

$\log L_{ij} = \sum_{k=1}^{M} \{ (N_{ijk} - m_{ijk}) \log P_{ijk} + m_{ijk} \log (1 - P_{ijk}) \}$

where $N_{ijk} = \sum_{l=j}^{h+1} m_{ilk} + \sum_{l=1}^{h+1} \sum_{r=k+1}^{M} m_{ilr}$.



It follows that $P_{ijk}$ has the maximum likelihood estimator

$\hat{P}_{ijk} = 1 - m_{ijk} / N_{ijk}$

where $\hat{P}_{ijk} = 1$ if $N_{ijk} = 0$.

This also gives

$\hat{a}(i,j,r_k) = (1 - \hat{P}_{ijk}) \prod_{l=1}^{j-1} \hat{P}_{ilk} \prod_{l=1}^{h} \prod_{r=1}^{k-1} \hat{P}_{ilr}$


The result as presented has an intuitive appeal in that, when
there are only two states, the $\hat{P}_{ijk}$ estimator reduces to the analogous
product limit estimator. In the situation of K states the results
yield a competing risk model given by Hoel (1972).
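
For the two-state case, where the estimator reduces to the product limit form, the calculation can be sketched as below. This is a minimal illustration using hypothetical sojourn times; the function name `product_limit` is an assumption, and censored sojourns tied with a failure time are counted as still at risk, which is the usual convention.

```python
def product_limit(times, events):
    """Product-limit estimate G(r_k) at each distinct failure time r_k.

    times  : observed sojourn times
    events : 1 if the sojourn ended in the transition of interest, 0 if censored
    """
    failure_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival, g = [], 1.0
    for r in failure_times:
        m = sum(1 for t, e in zip(times, events) if t == r and e == 1)
        n_at_risk = sum(1 for t in times if t >= r)
        p = 1.0 - m / n_at_risk if n_at_risk > 0 else 1.0  # P_k = 1 - m_k / N_k
        g *= p                                             # G(r_k) = product of P_l
        survival.append((r, g))
    return survival

# Hypothetical sojourn times (months) and event indicators.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

for r, g in product_limit(times, events):
    print(r, round(g, 3))
```

Each factor is one minus the number of transitions at a distinct time divided by the number still at risk, and the survivorship estimate is the running product of these factors.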



In the situation of multiple risks the above estimators
can in fact depend on the assumption inherent in the
reparameterisation of the survival rates as in (7.1.5). This point
regarding the arbitrariness of the conventions in situations of more
than two states is in fact accepted by the authors.

The likelihood (7.1.4) may be made more general by reparameterising
(7.1.5) differently. We will do so with the aim of bringing the
estimators closer to the product limit estimator in the situation of
single risks. The major problem with adopting a maximum likelihood
approach would then be the problems associated with the estimation.
It is quite likely that the $P_{ijk}$ will not have a closed form estimator.
An alternative method would be to use a predictor-corrector method
using $a(i,j,r_k)$, the probability of a transition from state i into j
of duration $r_k$. The $P_{ijk}$ may then be reparameterised according to

(1) k (0) 1 " a(0) k m 

k 1=1 W i- J a (0) (i,j,r.) 1=1 iji 

3-1 1 

where in the above the $a^{(0)}$ from one step of estimation are used to form
the new survival function $G^{(1)}$. Clearly in the first step the $a^{(0)}$ values
are set to zero. The corrector part in the above model is then the
ratio of the probability of a case not making a sojourn of less than
a particular duration ($r_k$) from state i to j, to the probability of
not making a sojourn of less than the same duration from state i to
any state. Such a weighting of the transition probabilities will then
correct the original probabilities by the ratio of the units of time
available for transition at each state for a given time.

We will now continue with the analysis of the data based on
the described method. We will later plot the semi-Markov probability
plots based on a single step and a three step procedure.
In all of what follows we will be presenting transition rate
schemes for the relevant disease states. We will not estimate
transition times to censorings, since they do not have the same
interpretation in terms of the disease.

The 561 breast cancer patients are observed under a
semi-Markov setting. We assume there are three transient states:

Randomisation -> Local recurrence -> Metastatic recurrence

with Death reachable from each of these states.

We assume no local recurrence after a general recurrence, which is
a justifiable assumption based on clinical definitions. We also
assume that there is a further state for censored cases, although there
is no reason for presenting the probability distributions for this
class of patient.

The data consist of 921 epochs for the 561 patients. All
patients begin in state 1 (not a necessary assumption), and then
transfer from one state to another until the history of
observation for a patient ends in an absorbing state (dead, censored).

Case Number    Time of Sojourn    Arriving state
1              52                 2
                                  3
                                  4
2              193                5
               192                5

In the above sub-sample of the data the first patient has local and
metastatic recurrence in the 52nd month, and zero transition time to
death. Case 2 survives for 193 months with no recurrence.

Of the 561 patients, 105 have local recurrence, 166 have general
recurrence and 63 are dead after the first transition from
randomisation, giving

Pr (transition 1,2) = 0.29 ± 0.025
Pr (transition 1,3) = 0.50 ± 0.037
Pr (transition 1,4) = 0.21 ± 0.029

For the 105 patients with local recurrence, 89 die with metastatic
recurrence and 7 die with no metastatic recurrence.

Pr (transition 2,3) = 0.93 ± 0.038
Pr (transition 2,4) = 0.07 ± 0.027

255 patients have metastatic recurrence and 226 of them die. The
rest are censored.

Pr (transition 3,4) = 1.00

The semi-Markov approach introduces certain assumptions which are
too restrictive and to some extent unnatural for survival studies.
We mention these assumptions here for the present analysis, but
later we will introduce non-proportionality of hazards as a good
basis for the study of the scale of survival times. The set of
metastatic patients is in fact composed of two groups: the group
with previous local recurrence and the group with no record of previous
recurrence. A property of semi-Markov models is that the transitions at
any stage do not depend on the previous states, and hence in this case
we assume that the assessment of the progression of the disease from
metastatic disease to death is not affected by the presence or absence of
local disease. Further, in the semi-Markov models the times of transition
from previous states do not play any role in the pattern of development
of the disease at the present state. Hence, regardless of the time
at which a patient becomes metastatic, the analysis of a sojourn time
is performed from the moment the patient enters that sojourn time
onwards.







We will now present the cumulative probability of transitions, that is, the
conditional probability that the time of transition from one state to the other
exceeds a time t:

$\Pr(\text{transition at } t, i, j) = P\{T > t \mid \text{present state is } i, \text{next is } j\}$

Figures (7.1.1) to (7.1.3) show plots of the probabilities
against months of transition from randomisation, local and general
recurrence. Once there is a local or general recurrence there is
a fast progression to death, figures (7.1.2) and (7.1.3). There is
some similarity between the transitions from local or general disease to
death, although the local to death set is very small. In figure
(7.1.2) the plot of general recurrence probability appears to start
at 0.7; the reason is the subgroup showing simultaneous local
and general disease. In such cases the time of transition from
randomisation to local recurrence was recorded as a time from state 1 -> 2
and a zero transition from 2 -> 3.



Although we do not emphasise a statistical test of
the various subgroups of the data, we will report the calculations
of the transition probabilities for the two treatment options of the
trial. A formal test based on the proportionality of the hazards
will be developed later. Figures (7.1.4) to (7.1.7) summarise such
relationships. In interpreting figures
(7.1.1) to (7.1.7), a point must be emphasised that distinguishes
such plots from the earlier Kaplan and Meier estimators. The
present plots are transition probabilities as they occur, and therefore
at the end of a time scale there is always a descent of the probability
values to zero.



We will now use the extensions of the method as described
earlier. We will use probabilities of transition from one state to
another with a given duration to obtain a corrected value of the
analogous survival function (which is the G function). We repeat the
method for the different transition times, and in fact after three
stages of the method the estimated values reach a value such that the
fourth step does not contribute. The three step rates are presented
in the same figures as the one step method, for the transition times
from randomisation. In general we obtain rates which are closer
to those from the product limit estimator. For times other than
those from the initial state at randomisation we obtain values close to
those of the one step method, and these are therefore not presented.

In general the method has a drawback in that it assumes that
censorings are unimportant. This effect is most important for the
interpretation of the figures at the lowest
levels of survival probability, when only a few patients
remain. The three step method, on the other hand, presents an
improvement on the rates of transition. In the next section we will
use the proportional hazards assumption to study some of the
events mentioned in this section.

7.2 Time scale variability of the covariate process.

In this section we will study the effect of treatments
on patients in terms of secondary response variables. The relevant
questions are: given that an event has taken place and that it has been
a progression of disease, firstly, how is each treatment group
behaving, and secondly, how is each prognostic indicator affecting
the disease process within each group?

The secondary events that we have considered so far under
the framework of the old Edinburgh trial are related to the various
forms of recurrence of the disease. These results, together with
the results of the exploratory approach of the non-parametric
likelihoods, indicated a high degree of compatibility between the times
from randomisation to any secondary event such as local recurrence
or metastatic disease. Now we will analyse the effect of covariates
and treatment from a secondary event to a later event. This form of
analysis fits the framework of semi-Markov processes, in which rates
of transition from any state may depend on the state the subject
is occupying.

In the section on the construction of the overall likelihood
we showed how the probability of a response may be represented
given the previous event and censoring numbers prior to a time $t_{(i)}$.
Now we will expand and define a similar formulation in terms of more
than one event of interest. In any given time period we defined
two types of events of interest. One event was named the
responding event of interest and the other was named a censoring.
Now the argument may be expanded to allow various forms of
recurrence of the disease to contribute to the partial likelihood.
This can be achieved by allowing censoring to contain other events
after the event of interest. Therefore for any time interval
$t_{(i)}$ to $t_{(i+1)}$ we may have s possible strata in which transitions of
various forms are taking place. In the single risk case the full
likelihood was represented by



$\prod_{i=1}^{k} \Pr(\text{individual } (i) \text{ dies} \mid \text{immediately last censoring and present death information})$

$\times \prod_{i=1}^{k} \Pr(\text{individual } (i) \text{ is censored} \mid \text{number at risk after censorings and death})$



The former part of the likelihood is by definition the partial
likelihood of Cox (1975). Now, by grouping the cases into s strata
within which a particular response set is available, we write



$\prod_{l=1}^{s} \prod_{i=1}^{k_l} \Pr(\text{individual } (i) \text{ responds} \mid \text{immediately last censoring, present death and transition information})$

$\times \prod_{l=1}^{s} \prod_{i=1}^{k_l + 1} \Pr(\text{individual } (i) \text{ is censored} \mid \text{number at risk after deaths, censorings and transitions})$



The present development by Gail et al (1980) indicates that
Cox's method may be used in an analogous manner with a stratified
analysis of the data. The strata are further defined to be a
function s = S{N(t), Z(t), t}. Z(t) and t have the usual
interpretations under a Cox model. However, N(t) represents a
counting process by which one can allow the baseline hazard
function to vary for the various forms of censoring or events,
depending on the time of an event. The initial recorded
event of interest is the appearance of local disease prior
to a metastatic development. Under the present study another failure
time of interest is the time to appearance of either local or
metastatic disease, usually termed the disease free interval.
As a general rule we define a three parameter function to represent
a response variable

R.V. (Entry, Termination, Censoring). 
For the disease free interval the function is, 

DFI (Randomisation, Local or metastatic disease, Last follow-up) 
For the progression of the local disease we may be interested in, 

(Evidence of Local disease, Metastatic recurrence or death, 
last follow-up) 

or alternatively (Evidence of local disease, Metastatic recurrence,
last follow-up or death).
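The three parameter definition above can be sketched in code. The following is only an illustrative sketch: the function name `response_variable`, the use of `None` for an event that never occurred, and the example patient are assumptions made here and are not part of the trial data.

```python
from typing import Optional, Sequence, Tuple


def response_variable(
    entry: Optional[float],
    terminations: Sequence[Optional[float]],
    censorings: Sequence[Optional[float]],
) -> Optional[Tuple[float, int]]:
    """Build R.V.(Entry, Termination, Censoring) for one patient.

    All times are measured from randomisation; None means the event
    was never observed. Returns (duration, event) with event = 1 for a
    termination and 0 for a censoring, or None when the patient never
    entered the interval (no entry event).
    """
    if entry is None:
        return None
    # Only events at or after the entry time can close the interval.
    term = [t for t in terminations if t is not None and t >= entry]
    cens = [c for c in censorings if c is not None and c >= entry]
    first_term = min(term) if term else None
    first_cens = min(cens) if cens else None
    if first_term is not None and (first_cens is None or first_term <= first_cens):
        return (first_term - entry, 1)
    if first_cens is not None:
        return (first_cens - entry, 0)
    return None


# Hypothetical patient (months): local recurrence at 14, metastatic
# recurrence at 30, death at 41, which is also the last follow-up.
local, metastatic, death, last_follow_up = 14.0, 30.0, 41.0, 41.0

# DFI (Randomisation, Local or metastatic disease, Last follow-up)
dfi = response_variable(0.0, [local, metastatic], [last_follow_up])

# (Evidence of local disease, Metastatic recurrence, last follow-up or death)
local_to_met = response_variable(local, [metastatic], [last_follow_up, death])
```

Under this sketch the same patient contributes a different duration and event indicator to each response variable, and a patient with no local recurrence is simply excluded from any interval whose entry is the local disease.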

As we presented the hazard rates in Chapter 2, the initial period
after treatment shows converging hazard rates for the two treatments.
We suspect that a similar pattern may be present for the time to
local and metastatic spread of the disease. We therefore conjecture
that there may be a time dependency present and that the
proportional hazards model with a time dependent covariate may be more
suitable. That is, we may obtain relative risk factors of the form
of figure (7.2.1). The periods following the above critical events
are also of interest. That is, we may be interested in the time from
local disease to death or to the development of metastatic recurrence.
We define the stages of progression in the present trial
to be R, L, M and D for randomisation, local recurrence, metastatic
spread and death. We also define a hazard rate for each of the
intervals to be $\lambda^{<i>}_{<j>}$, where <i> refers to the entry set and <j>
to the departure set. Thus $\lambda^{R}_{L}$ is the hazard for randomisation to local
recurrence with other events as censored, and $\lambda^{L}_{M}$ is the hazard for
local to metastatic recurrence. This notation produces a general
enough terminology by which various forms of entry time and
termination time may be defined.



Time from local recurrence to death, $\lambda^{L}_{D}$, is then given by
$(\lambda^{L}_{M}, \lambda^{L}_{D}, \lambda^{M}_{D})$. Time from local recurrence to death or
metastatic recurrence, whichever happens first, $\lambda^{L}_{M,D}$, is then a
function of the hazards of the strata $(\lambda^{L}_{M}, \lambda^{L}_{D})$. Any case not
indicated as a member of set <i> in $\lambda^{<i>}_{<j>}$ is then excluded
from the strata, and cases present in set <i> and absent from
indicator set <j> are the censored set of the study. In the above
notation we define each $\lambda^{<i>}_{<j>}$ in terms of a time variable (t).
Each covariate set Z(t) would then be associated only with the set
<i> present at the time of study. A time dependent function
Z(t) can include information on past history by referring to
events prior to <i>. The main emphasis
of study with this approach is to determine separately for each
stratum the significance of a particular covariate set for a given
response variable. This is different from a stratified analysis
of the type described in earlier chapters, where the emphasis was
on obtaining efficient estimators for a common $\beta$ obtained from
pooling information from all strata. The former approach requires
models of the form

$$ \lambda_s(t, Z(t)) = \lambda_{0s}(t) \exp(\beta_s Z(t)) $$

where, depending on the particular form of the response variable, the $\beta_s$
estimator is different. The latter approach requires

$$ \lambda_s(t, Z(t)) = \lambda_{0s}(t) \exp(\beta Z(t)) $$

The last two models clearly differ in their functional form of
$\beta$ and $\beta_s$.

An example of the time dependent model in the study of
randomisation to death would then be introducing a time dependent
covariate Z(t) = 0 if time for a single patient is prior to
metastatic disease and Z(t) = 1 if time is after the metastatic
disease. Basically in this approach we are affecting the
proportional hazard rate by introducing different weights to the
time prior to, say, a critical event and the time after it, for
each fixed covariate set.
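One common way to present such a Z(t) to a proportional hazards fit is to split each patient's follow-up at the metastatic date into (start, stop] episodes. The sketch below only illustrates that bookkeeping; the function names and the tuple layout are assumptions made here, not the procedure actually used for the trial.

```python
from typing import List, Optional, Tuple

Episode = Tuple[float, float, int, int]  # (start, stop, z, event)


def z_of_t(t: float, t_met: Optional[float]) -> int:
    """Z(t) = 0 before metastatic disease and 1 after it."""
    return 1 if t_met is not None and t > t_met else 0


def split_at_event(follow_up: float, died: int,
                   t_met: Optional[float] = None) -> List[Episode]:
    """Split randomisation-to-death follow-up at the metastatic date.

    follow_up : time of death or last follow-up (from randomisation)
    died      : 1 if follow_up is a death time, 0 if censored
    t_met     : time of metastatic disease, or None if never observed
    """
    if t_met is None or t_met >= follow_up:
        # No intervening event inside the follow-up: one episode, Z = 0.
        return [(0.0, follow_up, 0, died)]
    # Before the event Z = 0 and the episode ends without a death;
    # after it Z = 1 and the episode carries the final outcome.
    return [(0.0, t_met, 0, 0), (t_met, follow_up, 1, died)]


episodes = split_at_event(follow_up=40.0, died=1, t_met=25.0)
```

Each episode then enters the risk sets only while it is open, so the patient is counted with Z = 0 up to the metastatic date and with Z = 1 afterwards.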

Up until this point we have been mainly concerned with
covariates that are either fixed at the time of entry of
patients or part of a process external to time and
the response variable. Strictly, the time effect is assumed to be
completely related to the covariate set which is fixed from the
beginning. This form of analysis is often the most important
and often sufficient. However, the covariate set may
have a changing pattern in time. In this situation two different
conditions may be of interest. The first is the situation where time
trends are present and are due to the processes within the
covariate of interest. An example is the age of
patients in a low mortality study. We will observe an ageing
effect, and if the duration of survival is short it may be of interest
to know a possible trend in ageing. Secondly, two processes may
be intertwined. An example is the study of long term survival in the
presence of ageing. In this situation the time trend may be related to
the time scale itself and we may be interested in detecting departures
of a particular type from the model, such as non-proportionality of a
particular type. The latter type of time dependency forms the basis
of an analysis in which we will test non-proportionality due to
an intervening event. In these analyses we will study the
randomisation to death time for the Edinburgh trial and consider the
metastatic spread to be the intervening event. In the context of
the present study of the old Edinburgh trial, we identify three
forms of covariates. The one generally known is a fixed prognostic
attribute of the case at diagnosis. Clearly these effects are
external to the time scale and are inherently related to each
individual patient. An example is the effect of node status or
site of main lesion. These effects were generally dealt with in
the previous section. Now we introduce the time dependency concept
and look at some of the fixed covariates. An example, although not
part of the discussion under the present framework, is the age of the
patients being related to the time scale. This time dependency affects the
duration and/or magnitude of the age effect. Another similar
covariate is the size of the initial tumour and the duration of
its effect on the survival time. Further we consider a third type,
the stochastic type of internal covariate, in which we introduce
time dependent covariates of a type related to the duration of prior
events. This is taken to be the effect of time to local or
general recurrence in determining survival after these events.
Another example of the third type is to consider a time
dependency related to the status of patients, where the covariate
is inherently related to the survival process. An example of this
effect would be allocating y(t) = 0 and y(t) = 1 for times prior to
and after metastatic disease.



In Chapter 6 we concluded that age, menopausal status,
T, N, S and size are the important factors affecting the survival
of patients. This pattern is consistent for both arms of the trial.
An interesting form of analysis is then related to the effect of
prognostic indicators in the time prior to metastatic disease and the
significance of the effects after this event.

At this stage we attempt to utilise the size and age
information by a time dependent covariate. Further we deal with
an internal stochastic time dependency by considering the level of
progress of disease due to the appearance of local and metastatic
recurrence (i.e. the disease free interval). In terms of the
secondary failure times, however, we consider the local disease also
to be an event of interest. This is different from the exploratory
approach of the last chapter, in which secondary failure times were
defined in combination as end points only.

Initially we introduce a model containing the
size effect only. Since size can have relevant time dependent
properties, the size effect is initially defined to have an external
effect on the time scale. This definition allows a relative
risk function to be estimated that projects the base line hazard
$\lambda_0(t)$ on to the corresponding hazard function $\lambda(t, size)$ only by
a linear and constant relation of the relative risk, namely
Exp($\beta_{size}$ . size). We further introduce the treatment effect by the same
procedure and definition and produce a relative risk function

Exp($\beta_{size}$ . size + $\beta_{treatment}$ . treatment)

These two models together with the treatment only model of the last
section yield the following values for the estimators.

RR = Exp($\beta_{treatment}$ . treatment)

$\beta_{treatment}$ = 0.3677   S.E. = 0.1168   $\chi^2$ = 9.97    p = 0.001

RR = Exp($\beta_{size}$ . size)

$\beta_{size}$ = 0.2132        S.E. = 0.0562   $\chi^2$ = 22.60   p < 0.0001

RR = Exp($\beta_{size}$ . size + $\beta_{treatment}$ . treatment)

$\beta_{size}$ = 0.2271        S.E. = 0.0581   $\chi^2$ = 23.60   p < 0.0001
$\beta_{treatment}$ = 0.3289   S.E. = 0.1253   $\chi^2$ = 7.92    p = 0.0012

Clearly the models indicate a better relative survival time for the
radical surgery treatment versus simple surgery with X-ray therapy.
The relative risk implies a worse survival of order 1.44 for the
simple surgery group.



The size effect is also playing a consistently increasing 
role. For each two centimeters increase in the size of the 
initial tumour the relative risk increases by an order of 1.53. 
The effect of size and treatment given the present time scale is
additive in the relative risk sense and there is no significance
attached to the slight negative estimate of $\beta$ for the size and
treatment interaction.

RR = Exp($\beta_{size}$ . size + $\beta_{treatment}$ . treatment + $\beta_{s.t.}$ . treatment . size)

$\beta_{size}$ = 0.2189        S.E. = 0.0597   $\chi^2$ = 22.61   p < 0.0001
$\beta_{treatment}$ = 0.3310   S.E. = 0.1319   $\chi^2$ = 8.43    p = 0.0010
$\beta_{s.t.}$ = -0.0521       S.E. = 0.0732   $\chi^2$ = 0.85    N.S.
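The relative risks of order 1.44 and 1.53 quoted above follow directly from the single covariate estimates, since the relative risk for a change of delta units in a covariate is Exp(beta . delta). A small sketch, in which the helper name `relative_risk` is chosen here only for illustration:

```python
import math


def relative_risk(beta: float, delta: float = 1.0) -> float:
    """Relative risk for a change of `delta` units in a covariate."""
    return math.exp(beta * delta)


# Treatment only model: simple surgery with X-ray versus radical surgery.
rr_treatment = relative_risk(0.3677)

# Size only model: a two centimetre increase in initial tumour size.
rr_size_2cm = relative_risk(0.2132, delta=2.0)

print(f"treatment: {rr_treatment:.2f}, size (+2 cm): {rr_size_2cm:.2f}")
```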
Now we define the relative effect of each covariate to be
dependent on a transformation of the time scale. That is, we firstly
introduce a time dependent factor to assess the influence of size
over time and secondly test for the proportionality of the hazard
rates of the option effects. As we showed in Chapter 4, the most
natural form of a transformation of the time scale is achieved by
a log transformation, and thus for the time dependent covariates
we introduce a log transformation followed by a subtraction of
the near mean for normalising the variable. Therefore initially all
time dependencies are scaled to [log (time in months) - 2].

First we introduce a model with the time dependency of the
option effect. By the definitions of the proportional hazards the
effect of time must be consistently related to the relative risk
function regardless of the time of death.

RR = Exp($\beta_{treatment}$ . treatment + $\beta_t$ . treatment . [log (time) - 2])

$\beta_{treatment}$ = 0.3782   S.E. = 0.1174   $\chi^2$ = 10.43    p = 0.001
$\beta_t$ = -0.0921            S.E. = 0.1131   $\chi^2$ = 0.7551   N.S.

Thus there is no indication that the proportional hazard assumption
is violated with respect to treatment. There is a slight negative
value attached to $\beta_t$, which indicates that with increasing time the
treatment effect diminishes and that the largest differences
due to treatment are in the earlier part of the study.
and size separately. Tables (7.2.1) and (7.2.2) present intermediate
values for obtaining the relative risks.

                      [log (time) - c]
size    size*       -2         2.5        3
0       -2       -0.6694    -0.5509    -0.4914
2        0        0          0          0
4        2        0.6694     0.5509     0.4914

Table (7.2.1) ($\beta_{size}$ . size* + $\beta_t$ . size* . time)

                      [log (time) - c]
size    size*       -2         2.5        3
0       -2        0.572      0.601      0.6117
2        0        1          1          1
4        2        1.953      1.7348     1.63

Table (7.2.2) Relative risk.



At time zero, size plays its maximum role in determining the
risk of death. The relative risk is initially twice as great for the
larger values of size compared with the cases at the mean size of 2
centimeters. Again, initially the relative risk for tumours of less
than 2 centimeters is 60% of that for
the cases with mean size of 2. The values of relative risk
converge with time. The relative risks reach 1.6 for larger
tumours and 0.61 for smaller tumours at the 130th month.

Now we will consider the same form of time dependency with
a different functional form. It seems that the previous log
transformation is natural in the sense of non-proportionality of the
Weibull family. In the present study we suspect that the effect
of time non-proportionality for each size effect is linear. That
is, although we can hold the view that initially the size effect is
most significant, the distinction in the present function will
be in the nature of the rate of the decline of the hazard rates.
Table (7.2.4) refers to the new sets of risk functions that are estimated
using the new functional form of time dependency. The actual
relative risk is then presented as

RR = Exp($\beta_{size}$ . size + $\beta_t$ . size . [(time/20) - 2])

$\beta_{size}$ = 0.3081   S.E. = 0.3528   $\chi^2$ = 34.05   p < 0.0001
$\beta_t$ = -0.0154       S.E. = 0.0085   $\chi^2$ = 3.37    p = 0.064

Compared to the previous logarithmic function of time dependency the
actual magnitude of $\beta_t$ remains close to the present estimator.
The value of the estimator of the standard error of $\beta_t$ also changes
slightly. We refer to tables (7.2.3) and (7.2.4) and the accompanying
figure for a graphical representation of the relative risks.



                      [(time in months/20) - 2]
size    size*       -2         2.5        7
0       -2       -0.6778    -0.5392    -0.4006
2        0        0          0          0
4        2        0.6778     0.5392     0.4006

Table (7.2.3) ($\beta_{size}$ . size* + $\beta_t$ . size* . time)

                      [(time in months/20) - 2]
size    size*       -2         2.5        7
0       -2        0.5077     0.5832     0.6699
2        0        1          1          1
4        2        1.9695     1.7146     1.4927

Table (7.2.4) Relative risk.
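The entries of tables (7.2.3) and (7.2.4) can be reproduced from the two estimates of the linear time dependency model. The sketch below assumes, as the tabulated values suggest, that size* denotes size centred at the mean of 2 centimeters; the function names are chosen here only for illustration.

```python
import math

BETA_SIZE = 0.3081   # estimate of the size effect
BETA_T = -0.0154     # estimate of the size by time effect


def linear_predictor(size_cm: float, months: float) -> float:
    """beta_size * size* + beta_t * size* * [(time/20) - 2]."""
    size_star = size_cm - 2.0      # assumed centring at the mean size
    tau = months / 20.0 - 2.0      # rescaled time
    return BETA_SIZE * size_star + BETA_T * size_star * tau


def relative_risk(size_cm: float, months: float) -> float:
    """Relative risk against a tumour of mean size at the same time."""
    return math.exp(linear_predictor(size_cm, months))


# Columns -2, 2.5 and 7 of the rescaled time are 0, 90 and 180 months.
for size in (0.0, 2.0, 4.0):
    row = [relative_risk(size, m) for m in (0.0, 90.0, 180.0)]
    print(size, [round(rr, 4) for rr in row])
```

Under this reading the ratio of larger to medium sized tumours falls from about 2 : 1 at randomisation to about 1.5 : 1 at the 180th month.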





We thus conclude that there is a slight suggestion 
that size of tumour for the long term survivors may play a less 
important role. At the early part of the time scale size has 
the maximum effect in determining risks of death. The relative 
risk is highest for the larger tumours and has a ratio of 2 : 1 
for larger tumours versus medium sized tumours. This ratio reduces
to 1.5 : 1 for the same sizes by the 180th
month. One final remark is that the above conclusions are compatible
with models with no change over time and models with treatment
effect included.

7.3 Event time variability of the time scale.



Up until now all time effects have been dealt with on
the basis of a time scale and the covariate process. That is, we
have assumed the change in time scale to be due to an external process.
Now we deal with covariates in a time scale within which an internal
variability is assumed. That is, we may estimate the difference in
effect between the time prior to and the time after a critical event.
Generally it is regarded in breast cancer that the initial treatment
effect does not play a major role after the development of metastatic
disease.

We will initially develop a maximum likelihood function 
for a general approach based on a parametric method. The likelihood 
function will be used later to show how all the relevant inform- 
ation may be extracted by a particular test of the hazards. For 
the present methodology and the development of the likelihood we 
consider three separate time events. 

(1) Death without the recurrence of disease. 

(2) Time to the recurrence of the disease. 

(3) Time from recurrence of the disease to death. 
The situation is presented in Figure (7.3.1) 




Figure (7.3.1) Time line from start to death time.



We may thus expect that initially all patient groups are subject to
risks of both recurrence and death. We can represent the time
of death T as

$T = X_1$ if $X_1 \le X_2$, and $T = X_2 + X_3$ if $X_1 > X_2$.

We can expand the above definitions so that censorings may also be
included in the formulation. Here any recorded censoring may
refer to two censoring paths. One is censoring before both the
intervening event and death, and the other is censoring before death
but after the intervening event. Here we refer to censoring events as c.
Figure (7.3.2) refers to all the possible outcomes.



Figure (7.3.2) Types of possible observable events: time lines (a) and
(b) ending in a death at t, (c) and (d) ending in a censoring at c.

(a) refers to an outcome for a case that has a death with no
recurrence of the disease being recorded. The only observable
time is therefore $x_1 = t$ with the restriction $X_1 \le X_2$.

(b) refers to a case with a recurrence at time $x_2$ and a death
at time t, giving $x_3 = t - x_2$. In this case we have $X_2 < X_1$.

(c) refers to a case with a recurrence at time $x_2$ and a censoring at
time c, giving $x_3 > c - x_2$. In this case once again we have the
restriction $X_2 < X_1$.

(d) finally refers to individuals who are observed
but do not show a recurrence and are alive at the end of the study.
The distributional restriction of the case is then $X_1 > c$ and $X_2 > c$.
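The four outcomes can be summarised by a small function that takes the latent times and the censoring time and returns the type of observation. This is only a sketch for illustration: the function name `observe` and the returned layout are assumptions made here, and in practice the latent times are of course never all seen.

```python
from typing import Tuple


def observe(x1: float, x2: float, x3: float, c: float) -> Tuple[str, float]:
    """Classify one case into the outcomes (a) to (d).

    x1 : latent time to death without recurrence
    x2 : latent time to recurrence
    x3 : latent time from recurrence to death
    c  : censoring time (end of follow-up)
    Returns (type, time), where time is the death time t for (a) and
    (b) and the censoring time c for (c) and (d).
    """
    if x1 <= x2:
        # Death would precede recurrence: T = X1.
        return ("a", x1) if x1 <= c else ("d", c)
    # Recurrence would precede death: T = X2 + X3.
    if x2 > c:
        return ("d", c)          # neither event seen before c
    t = x2 + x3
    return ("b", t) if t <= c else ("c", c)


print(observe(5.0, 8.0, 3.0, 10.0))   # death with no recurrence
print(observe(9.0, 4.0, 3.0, 10.0))   # recurrence at 4, death at 7
```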

In general for most situations it is justifiable to make
assumptions on the distributions of $X_1$, $X_2$ and $X_3$, so that the random
variables are independent. Such an approach is useful in
estimating part of the likelihood function with parametric
restrictions for each of the three events. Here we consider a
general likelihood function. Later in the discussion of the
covariate we will reconsider the assumptions and show the use of a
convenient test for the independence of the distributions.

In the construction of the likelihood we consider a model
in which the two paths to death are independent. That is, $X_1$ is
independent of both $X_2$ and $X_3$, but the sections of the failure time
path with recurrence, namely $X_2$ and $X_3$, are dependent. The general
likelihood function may then later be completed with the usual
distribution functions like the Weibull or exponential.

We now introduce the following notation for the distributions
of $X_1$, $X_2$ and $X_3$ given $X_2$ has occurred. $X_1$ has the density
function $f(x_1)$, distribution function $F(x_1)$ and survival function
$\bar{F}(x_1)$. $X_2$ has the density function $g(x_2)$, distribution function
$G(x_2)$ and survival function $\bar{G}(x_2)$. Finally $X_3 \mid X_2$ has the
density function $h(x_3 \mid x_2)$, the conditional distribution function
$H(x_3 \mid x_2)$ and the survival function $\bar{H}(x_3 \mid x_2)$.

The maximum likelihood function is then composed of the 
contributions of the four types of observable events (a) , (b) , (c) 
and (d) , as in figure (7.3.2). 

(a) presents the conditional distribution of the death times given that
there has been no recurrence of the disease.

$$
\begin{aligned}
F_1(x_1 \mid X_1 \le X_2) &= \Pr[X_1 \le x_1 \mid X_1 \le X_2] \\
&= \frac{\Pr[(X_1 \le x_1) \cap (X_1 \le X_2)]}{\Pr[X_1 \le X_2]} \\
&= \frac{1}{\Pr[X_1 \le X_2]} \int_0^{x_1} \int_{y_1}^{\infty} f(y_1)\, g(y_2)\, dy_2\, dy_1 \\
&= \frac{1}{\Pr[X_1 \le X_2]} \int_0^{x_1} f(y_1)\, \bar{G}(y_1)\, dy_1
\end{aligned}
$$

The differentiation of the distribution function then gives the
density function.

$$ f_1(x_1 \mid X_1 \le X_2) = \frac{1}{\Pr[X_1 \le X_2]}\, f(x_1)\, \bar{G}(x_1) $$

Thus the contribution to the likelihood from a case dying at $x_1$
is

$$ f_1(x_1 \mid X_1 \le X_2) \cdot \Pr[X_1 \le X_2] = f(x_1)\, \bar{G}(x_1) $$


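(b) gives the joint conditional distribution of recurrence and death
after a recurrence.

$$
\begin{aligned}
F_{23}(x_2, x_3 \mid X_2 < X_1) &= \Pr[(X_2 \le x_2) \cap (X_3 \le x_3) \mid X_2 < X_1] \\
&= \frac{\Pr[(X_2 \le x_2) \cap (X_3 \le x_3) \cap (X_2 < X_1)]}{\Pr[X_2 < X_1]} \\
&= \frac{1}{\Pr[X_2 < X_1]} \int_0^{x_3} \int_0^{x_2} \int_{y_2}^{\infty} f(y_1)\, g(y_2)\, h(y_3 \mid y_2)\, dy_1\, dy_2\, dy_3 \\
&= \frac{1}{\Pr[X_2 < X_1]} \int_0^{x_3} \int_0^{x_2} \bar{F}(y_2)\, g(y_2)\, h(y_3 \mid y_2)\, dy_2\, dy_3
\end{aligned}
$$

After differentiating the above distribution function we
obtain the joint conditional density function.

$$ f_{23}(x_2, x_3 \mid X_2 < X_1) = \frac{1}{\Pr[X_2 < X_1]}\, \bar{F}(x_2)\, g(x_2)\, h(x_3 \mid x_2) $$

Thus the contribution to the likelihood for an observation with
the intervening event time $x_2$ and death at $x_3$ after the
recurrence is $\bar{F}(x_2)\, g(x_2)\, h(x_3 \mid x_2)$.

(c) For a case with a recurrence at $x_2$ and a censoring at the ending
point c, as in (7.3.1) we have

$$ \int_{c - x_2}^{\infty} f_{23}(x_2, x_3 \mid X_2 < X_1)\, dx_3 = \frac{1}{\Pr[X_2 < X_1]}\, \bar{F}(x_2)\, g(x_2)\, \bar{H}(c - x_2 \mid x_2) $$

giving a likelihood contribution represented by
$\bar{F}(x_2)\, g(x_2)\, \bar{H}(c - x_2 \mid x_2)$.

(d) For a case censored at c with no death and no time of recurrence
recorded, the contribution to the likelihood is

$$ \Pr[X_1 > c, X_2 > c] = \int_c^{\infty} \int_c^{\infty} f(y_1)\, g(y_2)\, dy_1\, dy_2 = \bar{F}(c)\, \bar{G}(c) $$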






Now we define a total of n patients with $n_1$, $n_2$, $n_3$ and $n_4$ patients
representing the number of cases with (a), (b), (c) and (d) events.
The final likelihood is then given by

$$
\begin{aligned}
L = {} & \prod_{i=1}^{n_1} f(x_{1i})\, \bar{G}(x_{1i}) \\
& \times \prod_{i=n_1+1}^{n_1+n_2} \bar{F}(x_{2i})\, g(x_{2i})\, h(x_{3i} \mid x_{2i}) \\
& \times \prod_{i=n_1+n_2+1}^{n_1+n_2+n_3} \bar{F}(x_{2i})\, g(x_{2i})\, \bar{H}(c_i - x_{2i} \mid x_{2i}) \\
& \times \prod_{i=n_1+n_2+n_3+1}^{n} \bar{F}(c_i)\, \bar{G}(c_i)
\end{aligned}
$$

Now by substituting a particular form of distribution
we will be able to estimate the relevant parameters. Here it may
be possible to obtain a reasonable estimation procedure for a
constant hazard case using an exponential distribution. However,
if we adopt a more robust distribution based on the Weibull
distribution the method will become very complex.
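For the constant hazard case the likelihood above can be written down and maximised in closed form. The following is a minimal sketch under assumptions that go beyond the text: $X_1$, $X_2$ and $X_3$ are taken to be exponential with rates a, b and d, and $X_3$ is taken not to depend on $x_2$. The function names and the toy observations are hypothetical.

```python
import math
from typing import List, Optional, Tuple

# Each observation is (kind, u, v):
#   ("a", x1, None)  death without recurrence at x1
#   ("b", x2, x3)    recurrence at x2, death x3 later
#   ("c", x2, c)     recurrence at x2, censored at c
#   ("d", c, None)   censored at c with no recurrence
Obs = Tuple[str, float, Optional[float]]


def log_likelihood(obs: List[Obs], a: float, b: float, d: float) -> float:
    """Log of the four-part likelihood with exponential components."""
    ll = 0.0
    for kind, u, v in obs:
        if kind == "a":      # f(x1) * Gbar(x1)
            ll += math.log(a) - (a + b) * u
        elif kind == "b":    # Fbar(x2) * g(x2) * h(x3)
            ll += math.log(b) - (a + b) * u + math.log(d) - d * v
        elif kind == "c":    # Fbar(x2) * g(x2) * Hbar(c - x2)
            ll += math.log(b) - (a + b) * u - d * (v - u)
        elif kind == "d":    # Fbar(c) * Gbar(c)
            ll += -(a + b) * u
        else:
            raise ValueError(f"unknown observation type: {kind}")
    return ll


def exponential_mle(obs: List[Obs]) -> Tuple[float, float, float]:
    """Closed-form maximum: events divided by time at risk."""
    n_a = sum(1 for k, _, _ in obs if k == "a")
    n_b = sum(1 for k, _, _ in obs if k == "b")
    n_c = sum(1 for k, _, _ in obs if k == "c")
    pre = sum(u for _, u, _ in obs)   # time at risk before recurrence
    post = (sum(v for k, _, v in obs if k == "b")
            + sum(v - u for k, u, v in obs if k == "c"))
    return n_a / pre, (n_b + n_c) / pre, n_b / post


toy = [("a", 5.0, None), ("b", 4.0, 3.0), ("c", 4.0, 10.0), ("d", 10.0, None)]
a_hat, b_hat, d_hat = exponential_mle(toy)
```

With a Weibull form in place of the exponential the same log likelihood no longer separates in this way and a numerical maximisation would be needed.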

If we adopt a distribution with the covariate restrictions
of the proportional hazards assumptions we will have

$$ \lambda(x_{1i}, Z_i) = \lambda_0(x_{1i}) \exp(\beta_1 Z_i) $$

as the hazard rate of the ith observation of the $x_1$ time. The
survival function is then, by the definitions of the introduction,

$$ \bar{F}(x_{1i}, Z_i) = \exp\left[-\int_0^{x_{1i}} \lambda_0(t) \exp(\beta_1 Z_i)\, dt\right] $$

There is clearly a one to one correspondence between the hazards
and the survival functions.

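Based on the assumptions of the proportional hazards we
will have

$$
\begin{aligned}
L = {} & \prod_{i=1}^{n_1} f(x_{1i}, Z_i)\, \bar{G}(x_{1i}, Z_i) \\
& \times \prod_{i=n_1+1}^{n_1+n_2} \bar{F}(x_{2i}, Z_i)\, g(x_{2i}, Z_i)\, h(x_{3i} \mid x_{2i}, Z_i) \\
& \times \prod_{i=n_1+n_2+1}^{n_1+n_2+n_3} \bar{F}(x_{2i}, Z_i)\, g(x_{2i}, Z_i)\, \bar{H}(c_i - x_{2i} \mid x_{2i}, Z_i) \\
& \times \prod_{i=n_1+n_2+n_3+1}^{n} \bar{F}(c_i, Z_i)\, \bar{G}(c_i, Z_i)
\end{aligned}
\qquad (7.3.2)
$$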

The above gives a good representation of the full likelihood for
a process involving an intervening event. Now temporarily
returning to the discussion of Chapter 4 we have

$$ L = \prod_{i=1}^{D} \lambda_0(t_i)\, e^{\beta Z_i(t_i)} \prod_{i=1}^{n} \exp\left\{-\int_0^{t_i} \lambda_0(u)\, e^{\beta Z_i(u)}\, du\right\} $$

as in (4.4.8), and

$$ L = \prod_{i=1}^{D} \left( \left[ \exp\left\{-\int_{t_{i-1}}^{t_i} \lambda_0(u) \sum_{j \in R_i} e^{\beta Z_j(u)}\, du\right\} \lambda_0(t_i) \sum_{j \in R_i} e^{\beta Z_j(t_i)} \right] \frac{e^{\beta Z_i(t_i)}}{\sum_{j \in R_i} e^{\beta Z_j(t_i)}} \right) $$

as in (4.4.11).
The expression (7.3.2) has been presented for proportional hazard
rates with fixed covariates and the probability density functions
f, g and the joint density function h. The expression (4.4.11) in
fact can allow time dependent covariates within the time scale (if
we assume non informative censoring and a further generalisation of
the model so that $Z_i(u)$ and $\lambda_0(u)$ are assumed to be independent
within the integration region). The consequence is to preserve
the proportionality of the hazards while testing the lack of fit
by a time dependent covariate t. We in fact can have the following
hazard rates.

$$
\begin{aligned}
\lambda_1(x_{1i}, Z_i) &= \lambda_0(x_{1i}) \exp(\beta_1 Z_i) \\
\lambda_2(x_{2i}, Z_i) &= \lambda_0(x_{2i}) \exp(\beta_1 Z_i + \beta_t t_1 Z_i) \\
\lambda_3(x_{3i}, Z_i) &= \lambda_0(x_{3i}) \exp(\beta_1 Z_i + \beta_t t_2 Z_i)
\end{aligned}
$$

$t_1$ and $t_2$ are described below and are not
related to (4.4.8) and (4.4.11).
In so far as a testing of covariate effects is concerned we may be
interested in tests of non-proportionality due to either $\lambda_2$ or $\lambda_3$,
which are assessed by functional forms of $t_1$ or $t_2$ respectively.
Thus (7.3.2) over generalises the process for a relevant test. We
then have the following 2x2 table.

Contribution of individual to likelihood
for a day of survival.

Test of non-proportionality    Before recurrence       After recurrence
For time to recurrence         $t_1$ = 1, $t_2$ = 0    $t_1$ = 0, $t_2$ = 0
For time after recurrence      $t_1$ = 0, $t_2$ = 0    $t_1$ = 0, $t_2$ = 1

The survival distributions can now be expressed as

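$$ \bar{G}(x_{2i}, Z_i) = \exp\left[-\int_0^{x_{2i}} \lambda_0(u) \exp(\beta_1 Z_i + \beta_t t_1 Z_i)\, du\right] $$

$$ \bar{H}(x_{3i} \mid x_{2i}, Z_i) = \exp\left[-\int_0^{x_{3i}} \lambda_0(u) \exp(\beta_1 Z_i + \beta_t t_2 Z_i)\, du\right] $$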



Note that in essence, since $\lambda_0(u)$ is a nuisance function, an
adjustment of the initial value of $\lambda_0(u)$ at each iteration should
suffice, and therefore $t_1$ and $t_2$ are otherwise essentially independent
of the integration. $t_1$ is then a function of u for its initial
value and independent of the integration by the definitions of the partial
likelihood. $t_2$ is conditional on $t_1$ and has a similar definition
from partial likelihoods. We thus can allow adjustment of the
time scale before or after the recurrence by introduction of a
time dependent covariate $t_1$ or $t_2$, using the proportional hazards
with the Kaplan and Meier base line hazards. As in (7.3.2) the
groups 1 to $n_1$ and $n_1+n_2+n_3+1$ to n are the usual contributors in the
absence of recurrence. The relevant part of the distribution of groups
$n_1+1$ to $n_1+n_2$ and $n_1+n_2+1$ to $n_1+n_2+n_3$ is now represented by either $t_1$
or $t_2$ depending on the type of test. From now on we will return to the
language of the proportional hazards model and express $t_1$ and $t_2$ as
the Z(t) covariates.

If the assumption of proportional hazards does not hold
for a period of the time scale between the two covariate subgroups, the
time dependency will be testable.
In the situation of non-proportional hazards the value
of Z is allowed to change within the time scale. In this context,
rather than assume a covariate effect is acting consistently in a
multiplicative manner on the base line hazard, we can test the value
of $\beta$ in particular periods of time. A different deviation
from the base line hazard within the time scale can then be attributed
to an a priori important event taking place before the last follow-up.
In order to assess the value of such an effect in application, we
test the impact of the development of metastatic disease. We consider
a time dependency of the above type, with Z(t) = 1 if time is after
metastatic disease and Z(t) = 0 if time is prior to metastatic
disease. Thus we will have a relative risk, initially composed of

Exp($\beta$ Z + $\beta_t$ Z x Z(t))

We know that the treatment plays an important role in determining
survival. By a rescaling and use of the assumptions of the
proportional hazards we did not have sufficient evidence to reject
the proportionality assumption for size or treatment. Now we test
the assumption of proportional hazards based on the development of
a secondary event using the above constructs and details.

RR = Exp($\beta_{treatment}$ . treatment + $\beta_t$ . treatment . Z(t))

$\beta_{treatment}$ = 0.3982   S.E. = 0.1170   $\chi^2$ = 11.61   p = 0.0007
$\beta_t$ = -0.1239            S.E. = 0.0781   $\chi^2$ = 3.10    p = 0.0748

Although again we do not reject the proportionality assumption
based on the development of metastatic disease, there is some indication
that the treatment effect is more substantial prior to metastatic
disease.
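The size of this indication can be read off by evaluating the fitted relative risk on either side of the intervening event. In the sketch below the coding treatment = 1 for the simple surgery group (with radical surgery as the reference) is an assumption made for illustration, as is the function name.

```python
import math

BETA_TREATMENT = 0.3982
BETA_T = -0.1239


def relative_risk(treatment: int, after_metastasis: bool) -> float:
    """Exp(beta_treatment * treatment + beta_t * treatment * Z(t))."""
    z_t = 1 if after_metastasis else 0   # Z(t) = 0 before, 1 after
    return math.exp(BETA_TREATMENT * treatment + BETA_T * treatment * z_t)


rr_before = relative_risk(1, after_metastasis=False)
rr_after = relative_risk(1, after_metastasis=True)
print(f"before: {rr_before:.2f}, after: {rr_after:.2f}")
```

The estimated treatment relative risk thus falls from about 1.49 before metastatic disease to about 1.32 after it, a reduction that does not reach significance at the p = 0.0748 reported above.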

The main relative risk under study so far has been the
survival relative risk based on the various covariate functions.
Earlier in this chapter we explained a method for defining different
response variables more clearly. Now we will study time dependency
with other response variables. This is analogous to the study of
competing risks or a multivariate failure time study. Initially we
will concentrate on stratified analysis based on the log-rank test.

The analysis of the hazard function $\lambda^{R}_{L}$ indicates that there is
not a significant difference between the two arms of the trial,
by either the log rank test or the Wilcoxon test (chi-squared
values 1.21 and 0.82 respectively). On plotting the survival
curves for both treatment groups we note quite similar rates.
However, on a plot of the hazard rates there is an indication that the
simple mastectomy group is at a slightly higher risk of developing
local recurrence than the radical group. This effect is not significant,
although it produces a relatively larger number of locally
recurrent patients within the first three years; see figures (7.3.3)
and (7.3.4).

By considering that the local disease may be an important
intervening effect we will continue with the analysis and consider $\lambda^{L}_{D}$
and later $\lambda^{L}_{M}$. Time from local recurrence to death and from local
recurrence to metastatic disease do not show a significant difference between
the two treatment options. The tests are performed for a stratified
analysis as well as a pooled stratified analysis according to time
to the development of local disease. The three strata are defined
as in figures (7.3.5) to (7.3.8).



The time from randomisation to metastatic disease indicates
a similar pattern to that seen for the time from randomisation to
death. The radical surgery group shows lower risks, with a
chi-squared value of 4.01 and a probability value of 0.0457. No
difference of statistical value is detected for the actual treatments
past the development of metastatic disease, namely for the hazards
$\lambda^{M}_{D}$. The tests for the hazards $\lambda^{L}_{M}$ are performed in a similar
manner to those of $\lambda^{L}_{M,D}$.

One pattern which consistently emerges indicates a higher
hazard rate for the simple surgery group in the initial 3 year period
after treatment, figures (7.3.9) and (7.3.10). The above figures
conform to the findings for the $\lambda^{R}_{L}$ hazards. We thus consider the
DFI = $\lambda^{R}_{L,M,D}$.

The disease free interval is traditionally an accepted
response variable in survival studies of breast cancer, and here
the hazards indicate a consistent distributional structure for both
local and metastatic periods.

The logrank test indicates a significant difference between
the treatments in terms of the DFI. The $\chi^2$ value based on the logrank
test is 4.08, which has the corresponding probability value of p = 0.043.

Finally we study the response time of $\lambda^{L,M}_{D}$, which is the
time from the development of the metastatic disease or local recurrence,
given locals are prior to metastatic disease, to the time of
death. The logrank test for the treatment differences indicates
a value of 4.85 with a significance level of 0.0277.



The period to the appearance of the local or metastatic
disease is further used as a stratifying variable for a comparative
analysis of radical versus simple surgery with XRT in terms of the
response variable with the hazard $\lambda^{L,M}_{D}$. We define three strata
based on the DFI (in years).

Stratum (R vs S)    Total R   No. R dead   Total S   No. S dead   $\chi^2$   P
DFI ≤ 1               22         21          28         28         9.39     .0022
1 < DFI < 3           41         38          50         49         1.48     .2240
3 < DFI               69         53          61         47         0.66     .4165

Then for patients recurring after the 1st year, having had radical
surgery is less likely to benefit the patients; see figures (7.3.11) and
(7.3.12). However, for those recurring early there is a benefit in
terms of survival from radical surgery. We will show later that this is
not an indication of interaction.
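The mechanics of the stratified and pooled stratified tests used above can be sketched as follows. This is a generic log-rank computation written for illustration only: the function names and the toy data are hypothetical, the individual patient records behind the table are not reproduced here, and the actual program used for the trial may differ in detail (for example in the handling of ties).

```python
from typing import List, Sequence, Tuple

Stratum = Tuple[Sequence[float], Sequence[int], Sequence[int]]


def logrank_components(times: Sequence[float], events: Sequence[int],
                       groups: Sequence[int]) -> Tuple[float, float]:
    """Return (O - E, V) for group 1 within a single stratum."""
    o_minus_e, var = 0.0, 0.0
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t in death_times:
        n = sum(1 for ti in times if ti >= t)                    # at risk
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of the group 1 death count.
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e, var


def stratified_logrank(strata: List[Stratum]) -> float:
    """Pool (O - E) and V over strata, then form the chi-squared value."""
    total, v = 0.0, 0.0
    for times, events, groups in strata:
        oe, var = logrank_components(times, events, groups)
        total += oe
        v += var
    return total ** 2 / v


# Toy stratum: group 1 dies at 1 and 3, group 0 dies at 2 and 4.
toy = ([1.0, 3.0, 2.0, 4.0], [1, 1, 1, 1], [1, 1, 0, 0])
chi2_single = stratified_logrank([toy])
chi2_pooled = stratified_logrank([toy, toy])
```

Applying the single stratum form to each DFI stratum in turn gives tests of the kind shown in the table, while passing all three strata together gives the pooled stratified test, each referred to a chi-squared distribution on one degree of freedom.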



We will continue the analysis by including a time
dependent covariate related to the disease free interval, using
Cox's proportional hazard model. We will use the formulations which
were presented in the early parts of this section on intervening
events. Now we will use such concepts to detect departures of a
specific type from the proportionality of hazards. In particular
we are interested in the group of patients showing an early recurrence
of the disease. We define t* as the time to the detection of
recurrence. We first analyse the data according to the relative risk
function for the time from first recurrence to death, in the presence
of treatment effects and time dependency due to the DFI.

That is, a model of the form

$$ RR = \exp\{\beta_{treatment}\,\mathrm{treatment} + \beta_t\,\mathrm{treatment}\,(\log(t^*) - 2)\} $$
A model with the treatment effect alone gives

$\beta_{treatment}$ = [illegible]   S.E. = 0.0658   $\chi^2$ = 5.20   P = 0.0225

which closely approximates the logrank test, where no time dependency
is included in the model. With the inclusion of a time dependent
effect we have

$\beta_{treatment}$ = -.1487   S.E. = 0.0698   $\chi^2$ = 5.81   P = 0.016

$\beta_t$ = -.0720   S.E. = 0.0501   $\chi^2$ = 2.85   P = 0.0871

There is not sufficient evidence to conclude that there is confounding
over the disease free interval. However, as an a priori one-tailed
test there is an indication of a narrowing of the difference between
the two treatments.
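The role of the centred term (log(t*) - 2) can be seen in a short
sketch of the relative risk function. The function name, the 0/1 coding
of the treatment indicator and the coefficient values used below are
illustrative assumptions, not the fitted model of the thesis.

```python
import math


def relative_risk(treatment, t_star, beta_treatment, beta_t):
    """exp(beta_treatment*treatment + beta_t*treatment*(log(t_star) - 2))."""
    time_term = treatment * (math.log(t_star) - 2.0)
    return math.exp(beta_treatment * treatment + beta_t * time_term)


# Hypothetical coefficients, chosen only to show the behaviour.
b_treat, b_time = -0.15, -0.07

# At log(t*) = 2 the time-dependent term vanishes, so the relative
# risk reduces to the treatment effect alone.
rr_centre = relative_risk(1, math.exp(2.0), b_treat, b_time)

# Away from the centring point the treatment effect changes with t*.
rr_early = relative_risk(1, 1.0, b_treat, b_time)
rr_late = relative_risk(1, 50.0, b_treat, b_time)
```

Centring the logarithm means that $\beta_{treatment}$ is read as the
treatment effect at log(t*) = 2, while $\beta_t$ measures how that
effect drifts with the time of recurrence.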

Finally in this chapter we will develop a methodology for
a family of functions for the analysis of an intervening event in a
clinical trial using Cox's proportional hazard model. Clearly there
are difficulties attached to the analysis of trial data if in the
course of the progress of disease there are a few routes acting which
differ for various patients. By a fixed covariate approach and
the proportional hazard assumption we may do a useful analysis as
long as there is not a crossover of the hazard rates. The $\beta$
estimator provides a good basis for the interpretation of data.

One of the problems in such an analysis is that often
the present methods of treatment may not affect the total survival
time but rather may lead to differing qualities of survival, depending
on the development of the progression of disease. The example of
analysis by the semi-Markov procedure gives a representation of the
problems involved. The method we will develop in this section, in
continuation of (7.3.2), allows a formal test to be performed for
the intervening event. By testing the rate of change to the event
of interest prior to an intervening event and post intervening event
for a particular treatment or subgroup, it is possible to detect
departures from the proportional hazard assumption. Much of the
work in this area is concentrated on the actual estimation of the
parameters. In line with the developments of the last section we
will continue by concentrating on the functional forms of the time
dependency. According to the previous definition we considered
two forms, of logarithmic and linear time dependency.

Now we will develop a functional form by which we may
study the pattern of development of risks by adjusting the rate of
severity of the intervening event to be a function of the time scale.
That is, we have a relative risk function of the form

$$ \exp\{Z_1 \beta_1 + f(Z_1, t^a)\,\beta_2\} $$

where $-\infty < a < \infty$. The importance of the intervening event
may then depend on the component of time prior to and after the event.
Figures (7.3.13) and (7.3.14) represent the two possibilities. The
figures also present various functions of $t^a$. There is an area
of close overlap within $t^a$ which covers $\ln(t)$ and also $\exp(t)$
for the value of t. In the example, if we consider metastatic disease
to be an intervening event, there exist three time dependence
variables, $t_m$, $t_D - t_m$ and $t_D$. Here $t_D$ represents the
survival time within which all measurements of events are made, and
$t_m$ and $t_D - t_m$ are periods of time that subdivide $t_D$. As an
extension we let a represent a weight function for the transformation
of the periods. Detection of the metastatic disease implies a
progression of the disease for both treatment groups. In the
comparison of the treatments, however, we expect the proportionality
of the hazards to hold throughout the time scale. By letting a > 0 we
can test the metastatic disease or other intervening event progress
in terms of deviation from proportionality. This transformation is
analogous to the $\exp(t_D - t_m)$ type of time dependency, by which
the longer the period of survival after intervention, the more risks
increase. Figure (7.3.13) with a > 0 shows a situation where there is
a build up of high risks from intervening events.

Alternatively we consider the time $t_m$ and the test of
the period up to the intervening event. A possible transformation
as presented in figure (7.3.14) is then by a = 0, which implies
that non proportionality due to the intervening event may be
assumed at a constant risk previous to the detection of the
intervening event. Further, the transformation with … < a < … is a
situation within which cases are initially at high risk of showing
a survival pattern more critical than the proportional hazards
assumption, but with the passage of time the two treatment groups
produce proportional rates. At a = 1 we have a replicate
transformation of the actual time scale. What is of importance in all
these transformations is the magnitude of the relative weights at
each period of time, in comparison with the adjoining times.
Therefore, for reasons of dimensional symmetry and also a faster
convergence of the Newton Raphson procedure, we use the time scales
$t_m$ and $t_{D-m}$ in normalised form.



We thus recall the relative risk function for the treatment only
model in the full time scale of randomisation to death.

$\beta_{treatment}$ = 0.3677   S.E. = 0.1168   $\chi^2$ = 9.97   P = 0.001

We will now consider a scaling of the time from metastatic disease
to death with a set to a value in the range 1 to 0. Previously we
defined a between $-\infty$ and $+\infty$. Clearly a value of a at the
infinite limit will transform the measure of time dependency to zero,
and in that situation we in fact return to the model with no time
dependency included. The initial value of a we consider is zero. In
the earlier part of this section we derived the value of time
dependency according to Z(t) = 1 for time after metastatic disease.
This is in fact the same as $Z(t) = t^0$. The estimated $\beta$
values are,

$\beta_{treatment}$ = 0.3982   S.E. = 0.1170   $\chi^2$ = 11.61   P = 0.0008

$\beta_t$ = -0.1239   S.E. = 0.0780   $\chi^2$ = 3.096   P = 0.07488

indicating that there is no suggestion of a lack of proportionality
of the type with a constant scale after metastatic disease.
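The family of power transformations of the time scale can be sketched
as follows. The function names are hypothetical, and the sketch assumes
a positive time that has already been normalised; it only shows how the
exponent a moves the covariate between the constant form (a = 0) and
the untransformed time (a = 1).

```python
def power_time(t, a):
    """Power transformation t**a of a positive (normalised) time t."""
    if t <= 0:
        raise ValueError("time must be positive")
    return t ** a


def time_dependent_term(treatment, t, a, beta_t):
    """Contribution beta_t * treatment * t**a to the log relative risk."""
    return beta_t * treatment * power_time(t, a)


t = 0.5                             # hypothetical normalised time after the event
constant = power_time(t, 0.0)       # a = 0: the same weight at every time
linear = power_time(t, 1.0)         # a = 1: the time scale itself
intermediate = power_time(t, 0.4)   # lies between the two
vanishing = power_time(t, 50.0)     # for 0 < t < 1 a large a drives the term to zero
```

For a normalised time below one the transformed value tends to zero as
a grows, and for a time above one the same happens as a decreases, so
the no-time-dependency model is reached as a limit in either scaling.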



Now we consider a linear effect of the metastatic disease.
That is,

Z(t) = linear normalised time after metastasis,

giving [(t/15) - 2], where t is time after metastatic disease. Thus

$$ RR = \exp\{\beta_{treatment}\,\mathrm{treatment} + \beta_t\,\mathrm{treatment}\cdot\mathrm{time}\} $$

$\beta_{treatment}$ = 0.3855   S.E. = 0.1181   $\chi^2$ = 10.11   P = 0.002

$\beta_t$ = -0.1389   S.E. = 0.0847   $\chi^2$ = 2.69   P = 0.0956

Table (7.3.1) presents the transformations of the time scale for
nonproportionality. Use of the various power transformations of the
time scale is a good check on the consistency of the results that may
be obtained. In the present context the nonproportionality does not
show a significant deviation from the proportional hazard model;
however we note that at a = 0.4 the scale of nonproportionality is at
its most efficient value.

In fact, for the present data the different power
transformations do not influence the estimator of treatment a great
deal. As a general conclusion, the appearance of the metastatic
disease does not influence the assumptions of the model. The final
conclusions of the present chapter in fact conform with the analysis
of Chapter 6. A point of interest, however, is that in the analysis
of this chapter we have not considered only one event variable but
rather two intertwined processes through time, and have concluded that
the events through time do not influence the conclusions of our study.
The implication in medical terms is that the relative risks between
the two treatments according to this data do not provide evidence of
a difference for times prior to and post metastatic recurrence. Since
the occurrence of metastatic recurrence is an intervening random
variable we use different transformations with a, and again there is
no suggestion of a deviation from the above finding.

Parameter              Estimated value    S.E.    $\chi^2$     P

$\beta_{treatment}$        .3982          .1170    11.22     .0008
$\beta_t$                 -.1239          .0780     3.18     .0749

$\beta_{treatment}$        .3985          .1163    11.25     .0008
$\beta_t$                 -.1321          .0775     3.02     .0822

$\beta_{treatment}$        .3992          .1158    11.43     .0007
$\beta_t$                 -.1368          .0775     3.15     .0761

$\beta_{treatment}$        .4021          .1151    11.75     .0006
$\beta_t$                 -.1381          .0749     3.44     .0639

$\beta_{treatment}$        .4034          .1148    11.76     .0006
$\beta_t$                 -.1411          .0721     3.89     .0484

$\beta_{treatment}$        .4029          .1152    11.78     .0006
$\beta_t$                 -.1411          .0734     3.82     .0509

$\beta_{treatment}$        .4018          .1159    11.45     .0007
$\beta_t$                 -.1408          .0751     3.65     .0559

$\beta_{treatment}$        .4015          .1163    11.26     .0008
$\beta_t$                 -.1407          .0775     3.45     .0637

$\beta_{treatment}$        .3885          .1169    10.83     .0010
$\beta_t$                 -.1401          .0809     3.02     .0822

$\beta_{treatment}$        .3867          .1176    10.53     .0012
$\beta_t$                 -.1395          .0825     2.94     .0861

$\beta_{treatment}$        .3855          .1181     9.97     .0016
$\beta_t$                 -.1389          .0847     2.79     .095

$\beta_{treatment}$        .3842          .1201     9.83     .0017
$\beta_t$                 -.1349          .0897     2.35     .126

$\beta_{treatment}$        .3769          .1211     9.75     .0018
$\beta_t$                 -.1211          .0928     2.19     .138

Table (7.3.1)
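The way the table is read can be sketched in a few lines: for each
transformation the chi-square of the time-dependency coefficient is
converted to a significance level, and the transformation giving the
largest chi-square is taken as the most efficient scale. The
chi-square values below are those read from table (7.3.1) for the time
coefficient, in printed order; the exponent a belonging to each row is
not legible in the table and is therefore not assumed here.

```python
import math


def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Chi-square values of the time-dependency coefficient, in printed order.
chi2_time = [3.18, 3.02, 3.15, 3.44, 3.89, 3.82, 3.65,
             3.45, 3.02, 2.94, 2.79, 2.35, 2.19]

p_values = [chi2_1df_pvalue(x) for x in chi2_time]
best_row = max(range(len(chi2_time)), key=lambda i: chi2_time[i])
print(f"largest chi-square in row {best_row + 1}: "
      f"{chi2_time[best_row]:.2f} (p = {p_values[best_row]:.4f})")
```

Only the row with the largest chi-square falls just below the 5% level,
which is consistent with the remark in the text that the time
dependency is most pronounced at one intermediate power and is
otherwise not significant.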






CHAPTER 8 
PROGNOSIS IN BREAST CANCER 

The purpose of this chapter is to evaluate the importance of 
certain prognostic indicators in a group of breast cancer patients. 
In this section however we make a distinction between indicators that 
are regularly assessed in the staging of patients and some other 
indicators that have not been considered a great deal in the past. 
The present data is related to a group of patients diagnosed as 
having breast carcinoma and referred to by H.J. Stewart et al (1968). 
We will deal later with the data and the procedures for its collection 
and the various measurements made on the patients. Before considering 
the data however, we will remark on certain important trends in the 
study of prognostic indicators and the auxiliary indicators that 
are used in the analysis. 

In this study we are not so much concerned with substantiating a
major disease indicator, but rather with considering whether some of
the survival time variability of the patients may be attributed to
some measurements outside the usually accepted prognostic indicators.
Thus the findings of this study may be of some value in a subsequent
sample of patients. In what follows we will refer to a number of
variables. Here we describe these variables, and later we refer to
them by their short notation. In the course of the discussion further
necessary details and references for some of the variables will be
given.

Tumour contour types will be discussed in greater detail in the next
section, and figure (8.1.1) refers to the classification of the
tumour.

Inoperability is also referred to in more detail in the discussion.
Basically, inoperable patients are patients who have had a spread of
the disease to the extent that no surgical treatment is performed.

Size is considered to be the maximal diameter of the initial tumour.

Node refers to the involvement of the axillary nodes according to
histological findings.

Extent refers to the depth of the initial tumour.

Grade is the histological grade of the initial tumour and is discussed
further.

Presence of complicated change refers to the type of tumour where
there is evidence of abnormal skin distant from the main tumour. These
include thickening of the skin overlying the tumour, blurring of the
tumour outlines and the dilation of adjacent veins. These effects are
observable by X-ray.

Tumour foci refers to two possible types of tumour, these being either
single or multiple foci.

Microcalcification is a method for detecting areas for histological
examination. Here we define possible areas where tumour calcification
had been shown.

8.1 Methodology and sources of data

Two studies in the past have mentioned the value of tumour contour
types: Ingleby et al (1960) and Lane et al (1961). Neither of these
studies, however, was concerned with assessment by use of a
probability measure of difference between the patients.

In the paper by H.J. Stewart two methods are discussed for the
assessment of the tumour contour types of 157 patients. One is paper
section and the other is mammography. Further, in the paper they
mention a few other measurements on the actual distributions of the
contour types, such as presence of complicated disease, extent,
tumour change etc. In the present study we will use the same data
for assessing survival distributions for different subgroups of
patients, in a more complete analysis of the data.

The grouping of breast cancer patients by clinical staging
is now a good guide to survival assessment. However in 1958 Harmer
recognised "at least ten systems all basically the same but each
irritatingly different from the next". The present system is
attributed to the Union Internationale Contre le Cancer (UICC) and is
the result of various systems that have been used in the past. A
single Manchester system was in use in Britain up until 1958, when
the staging was replaced by the TNM system, effective mainly in
Europe. As from 1966, a different general system was adopted in the
U.S.A. Finally in 1973 a system was adopted by the UICC and the
American Joint Committee on Cancer staging (UICC/AJCC), giving the
present method. This system distinguishes between pre-treatment
and post surgery findings and is based on node histology, size
of the tumour and metastatic status. Further, for the size categories
a distinction is made for tumours with fixation to the underlying
pectoral fascia, and for node one cases with moveable homolateral
axillary nodes a distinction is made between nodes containing growth
and those with no growth. Given these developments there is still
an enormous variation within any single stage. This is partly due
to the effectiveness of treatments. If the treatments were more
effective for all patients, there would be less emphasis on
classifying cases more precisely. However, part of the problem
in the assessment by classification is that it is a crude categorising
procedure for a complex biological process of host and tumour in
time, which is far more complex than an assessment made by a single
instantaneous measurement at a single time. A less "subjective"
assessment of the patient tumour process and survival prediction
would ideally require repeated measurements in time. This is,
however, not practicable, in that for clinical reasons it is accepted
that any diagnosis of breast cancer requires immediate treatment.



From a different point of view, other studies (J.E. Devitt, 1967)
have indicated that the clinical stage of breast cancer may not be
so much a measure of degree or extent of growth as a measure of
tumour biological potential and host reaction. In the present study
we concentrate on the survival time of patients as the only response
variable. However, we will not deal only with the study of static
prognostic indicators and a "frozen" patient resistance; rather
we distinguish between static indicators and indicators containing
information about changes or progression. The choice of an
indicator as static or as one containing growth is difficult, in that
almost any indicator can be considered to contain an indication of
growth. As an example, a variable that we will not consider as
time-dependent but which may contain such an effect is the shape of
the actual tumour. The pattern in the growth of the tumour may be
related to the form of body resistance to it (see the shapes of
tumours in figure (8.1.1)).

At this point we will present an example of a type of study
where measurement over time has been of some use in the study of
breast cancer. It is generally accepted that early treatment
improves the prognosis of patients. However, there is a lack of
consistent evidence in regard to the value of early diagnosis in the
improvement of survival times. A study was carried out by Bloom
(1965) to test whether a prompt diagnosis of breast cancer improves
survival as assessed from the date of first symptoms, and whether the
delay between the appearance of the first symptom and diagnosis has
become shorter in recent years. This study in fact reiterated the
commonly held view that cases with a short delay between the
appearance of the first symptom and diagnosis have a better long term
survival rate than those with long delays.

In this context it may be taken that the delay is in fact
a representation of the growth of the tumour. In studies of
time dependencies, as in other multivariate studies, the order of
incorporating a variable into the model is of some importance.
Often patient classification is measured by the staging of the
tumour. If delay is taken to be a prognostic indicator it is
measured after staging category effects have been removed. Thus in
the above example one problem with measuring the tumour development
based on delay time is that it may be confounded with certain other
factors inherent in each individual patient. Thus the question may
be phrased as that of assessing the value of delay after staging
variables have been statistically removed.

With this introduction on the types of models of interest
we will return to the description of the data as mentioned earlier
in this chapter. Initially we will use cross tabulations to show
the numerical association of the indicators among themselves and with
the number of cases alive at the end of the study. Later we will
use a Cox model assuming a constant relative risk throughout
follow-up. Then we will consider the estimation of Cox's model,
allowing for time dependent effects of prognostic variables.
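The estimation step for a Cox model with a constant relative risk can
be illustrated by a minimal Newton Raphson sketch for a single fixed
covariate. This is not the program used for the analyses reported
here: the function names and the toy data are hypothetical, ties are
handled in the Breslow manner, and no safeguard is included for data
in which the estimate is infinite.

```python
import math


def score_and_information(beta, times, events, z):
    """Score and observed information of the Cox partial log-likelihood."""
    score = information = 0.0
    for t_i, d_i, z_i in zip(times, events, z):
        if d_i != 1:
            continue  # censored cases enter only through the risk sets
        risk = [z_j for t_j, z_j in zip(times, z) if t_j >= t_i]
        weights = [math.exp(beta * z_j) for z_j in risk]
        s0 = sum(weights)
        s1 = sum(w * z_j for w, z_j in zip(weights, risk))
        s2 = sum(w * z_j * z_j for w, z_j in zip(weights, risk))
        score += z_i - s1 / s0
        information += s2 / s0 - (s1 / s0) ** 2
    return score, information


def fit_cox(times, events, z, max_iter=50, tol=1e-10):
    """Newton Raphson iterations started from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, information = score_and_information(beta, times, events, z)
        if information <= 0.0:
            break
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    _, information = score_and_information(beta, times, events, z)
    return beta, 1.0 / math.sqrt(information)


# Hypothetical toy data: six deaths, covariate coded 0/1.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
group = [1, 0, 1, 0, 0, 1]
beta_hat, se = fit_cox(times, events, group)
```

A time dependent effect is obtained from the same iteration by letting
the covariate value in each risk set depend on the failure time at
which the risk set is formed.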
Over 2000 mammographs were studied from 1963 to 1967. Among
the cases with mammograms, 306 cases had a first-time diagnosis of
breast cancer. This group includes certain patients for whom the data
is inadequate, and thus 98 cases have to be removed, so that the
remaining patients are a more clearly defined group. The
98 cases that were excluded are largely defined by the information
collected at the initial X-ray sessions. 53 of the patients had
been previously treated for breast abnormality. 14 of the
patients were initially diagnosed in wrong subgroups in terms of
their form of malignancy and thus were also excluded. 7 cases
had either an unusual malignancy or a post-operative death.
Finally, 24 cases had inadequate clinical information after diagnosis
or mammographs of inadequate standard.

                               Number    % of 306

Previously treated                53        17.3
Uncodable diagnostic error        14         4.6
Atypical malignancy                3         1.0
Post-operative death               4         1.3
Inadequate clinical details       13         4.2
Inadequate films                  11         3.6
Total excluded                    98        32
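The arithmetic of the exclusions can be checked directly from the
counts in the table. The sketch below recomputes each percentage of
the 306 first-time diagnoses and the number of cases remaining for
analysis; the variable names are illustrative only.

```python
total_cases = 306

exclusions = {
    "Previously treated": 53,
    "Uncodable diagnostic error": 14,
    "Atypical malignancy": 3,
    "Post-operative death": 4,
    "Inadequate clinical details": 13,
    "Inadequate films": 11,
}

# Percentage of the 306 cases in each exclusion category, to one decimal.
percentages = {
    reason: round(100.0 * count / total_cases, 1)
    for reason, count in exclusions.items()
}
total_excluded = sum(exclusions.values())
remaining = total_cases - total_excluded

for reason, count in exclusions.items():
    print(f"{reason:30s} {count:3d}  {percentages[reason]:5.1f}")
print(f"{'Total excluded':30s} {total_excluded:3d}  "
      f"{100.0 * total_excluded / total_cases:5.1f}")
```

The counts sum to the 98 exclusions stated in the text and leave the
208 cases on which the survival analysis is based.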



The 11 inadequate films were also taken at the beginning of the entry
month, when the technique was still being perfected. There are
208 remaining cases, who had a median follow-up time of 11½ years,
with a range from 4 to 18 years. Of this group 163 had died at
the time of study. No cause of death was recorded for the cases,
but the actual date of death is available. Of the remaining 45
patients with censored survival data, 17 had attended annual review
up to one year prior to the time of study, 21 were dismissed
after 10 years of follow-up and 7 were lost to follow-up
with less than 10 years of follow-up. Therefore 78% of the 208
patients have a recorded death time. The follow-up information in
this study was mainly obtained by extraction of the relevant
information from the Cardiff clinical notes in 1981.



Four tumour contours are defined, and this definition is
related to the types defined by Ingleby et al (1960). They define
3 types of tumour: irregular, smooth or mixed outline. Further,
they report better survival for smooth or circumscribed tumours.
In the present study an additional subdivision is made. Between
the extremes of smooth and spiculated, two categories are defined,
namely mixed tumours, with well defined smooth and spiculated parts to
their outline, and conglomerate tumours, which have a mulberry
appearance macroscopically and have a blurred and irregular but not
definitely spiculated outline on the mammograms. Further, 31 or
15% of the 208 mammograms had evidence of malignancy but showed no
tumour. Thus there are 5 groups in all.

Smooth    Conglomerate    Mixed    Spiculated

Figure (8.1.1) Representation of the contour types.

The method of obtaining mammographs was reported in 1968, based on
the Egan techniques. The assessment, however, considered only the
first 60 patients and used the Gough and Wentworth technique of paper
mounted thin whole breast sections.

Several further radiological features were also recorded
during the initial examination. For all cases size was recorded in
millimetres. Microcalcification was also noted at the special
X-ray review sessions, and thus patients were categorised into
calcification present within, on the outset, or both within and
outset. Clinical inoperability is a criterion that is not strictly
definable clinically; thus patients with no sign of metastatic
disease and operable tumours were recorded as operable cases, and if
metastatic disease is present or the tumour is inoperable they are
classified as inoperable. The point about inoperable and operable
patients is that they represent very different groups of patients,
and in the final analysis we will distinguish between statements
made in this regard.

Axillary node involvement is another well established
clinical indicator, and thus cases are grouped into node negative
and node positive groups. Bloom and Richardson (1957) define three
histological grades for lesions, which we use in this analysis. In
terms of shape of tumour we distinguish between multiple and single
foci.

Picard J.D. (1962) has defined several well recognised
features that can occur in the normal breast tissue around the tumour
shadow on the mammograms of an advanced primary lesion. This is
termed a tumour showing complicated change, and comprises thickening
and straightening of the trabecular shadows, thickening of the skin
overlying the tumour, blurring of the tumour outline and the dilation
of adjacent veins. The above features are present on X-rays when
there is oedema present clinically, but they were also noted at the
mammographic review sessions of the data.



Apart from complicated change, extent is also studied, by
separation of patients into those with tumours greater than and less
than ½ inch deep. Clinical size of the tumour and age of the patient
complete the data.

Variable No.   Variable name         Description

 1             Operability           Inoperable or metastatic, operable
 2             Microcalcification    Within, outset, both
 3             Node                  Negative, positive
 4             Histological grade    I, II, III
 5             Foci                  Single, multiple
 6             Contour               Smooth, spiculated, mixed, conglomerate
 7             Complicated change    Present, absent
 8             Extent                Less than ½ inch deep, greater than ½ inch deep
 9             Size
10             Age



The result of an interim analysis based on 157 mammograms was
published by Stewart (1968). In conclusion, no significant
relationship between contour types and certain prognostic indicators,
whether considered separately or together, was obtained. However, a
trend was noted, contrary to the findings of Ingleby and Gershon-Cohen
(1960) and Lane et al (1961), suggesting a better prognosis for
spiculated tumours and a bad prognosis for smooth and also possibly
mixed lesions. The 1968 analysis, however, did not deal with any of
the other indicators that we mentioned earlier in terms of survival
times.

8.2 Categorical distributions of the prognostic indicators

Initially we perform a preliminary analysis based on cross
tabulations of the prognostic indicators. Two well known and
accepted indicators are node histology and the initial size of the
tumour. The extent of the progress of the disease is important in
so far as we have to distinguish initially between the inoperable and
operable cases. The main group of interest is in fact the
operable patients. However, we will discuss the distribution of the
inoperables in the early stages of the analysis.

75% of the 208 cases were treated by mastectomy, but some were
treated with palliative intent in the presence of clinical
inoperability. 5 clinical groups may in fact be defined. 131 cases
belong to the accepted operable group of interest. 15 further cases
had positive contralateral mammograms and their survival distribution
is similar to that of the operable group. All 15 had a second
mastectomy from 1 to 27 years after the first. In contrast to these
two groups, there are 3 remaining groups in whom both the mean and the
median survival times are considerably less. 20 patients had local
but clinically inoperable tumours, and a further 20 have been termed
inoperable solely because of the detection of an involved
supraclavicular node at mastectomy. 22 others who presented with
systemic disease comprise the final group. In terms of the
progress of disease the patients were separated into operable and
inoperable patients. The main aim is to consider prognostic
indicators for the operable group. Clearly the operable patients
contain a smaller proportion of node positive patients (43%) than the
inoperable patients, with 55% node positive cases. These results
are clearly in line with the expectation that inoperable patients are
more advanced and thus contain a higher proportion of patients
with axillary node involvement. In fact operable cases, as a group of
patients with less advanced disease, have a higher proportion of
patients in better prognosis groups. For the grade of the tumour, the
inoperable cases have 74% of the patients with grade 3 tumours and
the operable cases 52%. This is not surprising and only conforms
to what is expected. (Later we will discuss the grade categories in
more detail in relation to other categories.)

For the extent of the tumour, operable cases present
a 9% proportion with greater than ½ inch depth, as against 32% for the
inoperable patients. Although this result conforms to what is
expected, it is also in line with a hypothesis which assesses extent
as a time-dependent indicator of progress. Once again the same
conclusions are obtained when we consider complicated change. 22%
of the operables present evidence of complicated change in the
initial tumour, as against 63% of the inoperables.

Multiple foci tumours form a small number of patients
altogether. We obtain a 16% multiple foci group among the operables
and a 23% multiple foci group for the inoperables. One indicator
that does show a similar distribution for the operables and the
inoperables is the tumour contour shape. We will study these
categories further in terms of survival, but at this stage there is
no evidence to link tumour contour types with the progress of
the disease. Calcification present within or on outset is also
similarly distributed for the operable group versus the inoperable
group. At the end of follow-up we also note that 36% of operables
are still alive, compared to 1% for the inoperable cases.

Node involvement is a further accepted prognostic indicator
in so far as this study is concerned. We will at first consider node
involvement for the total population and, in some important
categories, mention the distribution of node involvement for the
subgroups of operable and inoperable cases.

Two other variables that may indicate a progression process
are extent and complicated change. These two do not show
statistically significant associations with the node categories.
Greater depth tumours are present in 6% of node negatives and 17% of
node positives. Complicated change amounts to 25% of node negatives
and 31% of node positives. It is difficult here to conclude
what extent and positive nodes imply, but it is an indication that,
in terms of good and bad prognosis value, extent is describing
something slightly different from node status. Finally, for node and
survival status at the end of follow-up, 30% of patients with
initially no nodes involved are alive, and 18% of those with nodes
recorded as initially involved.

For the total of grade categories there are 12%, 16% and
17% of tumours with the higher extent depth at grade levels 1, 2 and 3
respectively. Separating the operable cases, there is again no
major deviation from the above for each subgroup of operable and
inoperable. However, by considering node negative patients against
node positive, the percentage values for the above grade categories
change to 18%, 12% and 9% for the node negatives and 11%,
12% and 10% for the node positives. Thus in terms of classifying
patients from good to bad prognosis there seems again to be an
indication that node status and extent may be defining different
attributes of tumour progression for each grade category. It must
be pointed out here that the above percentages are presented purely
to illustrate distributional patterns of subgroups of patients. In
the next section we will present survival distributions and the
relevant statistical tests.

Grade in terms of percentage of cases showing complicated
change gives values of 18%, 31% and 38% for grades 1, 2 and 3
respectively. This is a similar pattern in direction to that of
extent. (Although the complicated change values are significant at
p < 0.001, the extent categories are not.) By the definitions
of extent and complicated change it is possible that they are
explaining similar effects of the tumour progress. Once again, by
subclassifying by the operables and the inoperables we do not obtain
a major deviation. However, for the node status the same pattern as
that of extent emerges. That is, for node negative patients and the
respective values of grade we obtain 29%, 22% and 20% showing
complicated change, while we obtain 31%, 30% and 30% for node
positives showing complicated change at grades 1, 2 and 3
respectively. The conclusion from this pattern is clearly the same as
that for extent of tumour. However, it must be emphasised that up
until now we have considered relative effects in terms of prognostic
distributions and we have not dealt with the survival times.

In terms of contour types we do not detect any interesting
distributional patterns for the various values of grade. Grade in
relation to multiple foci and calcification distributions gives once
again a uniform pattern.

Calcification within and at outset, together with single
or multiple foci tumours, also show no significant association with
any of the other recorded variables.

The mean size of tumour is 3.25 centimetres. For the
operables and the inoperables we do not detect a major difference.
Size also gives a similar pattern for the node negative and the node
positive patients. However, extent and complicated change both show
a slightly different mean value at good and bad prognostic levels of
extent and change. For an extent of tumour less than ½ inch
we have a mean size of 2.51, and for an extent greater than
½ inch a mean size of 4.89 (t-test, p < 0.001). With
complicated change, however, this pattern is not represented so
significantly. Tumours with no sign of complicated change have
a mean size of 3.00 centimetres and tumours with complicated change
have a mean size of 3.71 centimetres (t-test, p < 0.01). Once
again there is an indication that if complicated change is playing
any role in classifying patients, it relates to a different group of
patients than the size category classification. For various other
factors, such as contour types, tumour foci and calcification, a
similar value for the mean size distribution is obtained.

The status of patients at the end of study indicates that
contour type, tumour foci and calcification do not play a major role
in determining survival of patients. Among the accepted indicators,
size, node and operability are the major indicators for determining
survival. With regard to the present data, however, two other
indicators are also of value: complicated change and the extent of
tumour. As we pointed out earlier, these two indicators refer to
groups of patients that are not identified by their node status, size
or operability. Later in this chapter we will discuss those patients
in more detail by considering survival probabilities.

Finally we will consider the age distribution of the patients
according to various categories. The mean age of patients is 50.2
years. For the operable patients we have a mean age of 50.1 years and
for the inoperables a mean age of 50.3 years. Therefore the age
distributions are very close. Node status categories again produce
very close mean ages, with the node negative patients being a little
older than the node positive patients. Mean ages for the grades of
the tumour are also very close to the overall mean value, with higher
grade patients slightly older than lower grade patients. For tumours
smaller than the mean size we again find that the age distribution is
the same as for the larger tumours. Extent and complicated change
also produce the same lack of age differences. In the case of foci,
calcification and contour pattern we again observe that the mean ages
of the various categories are very close to each other.

In the earlier part of this chapter we mentioned that some 
patients were excluded from the study. Altogether they comprise 32% 
of the 306 patients. From examination of the features of these 
excluded patients, we observe that in general the exclusions are 
uniformly distributed between the various categories of the indicators. 



8.3 Prognostic indicators according to survival time.

Up until now we have considered groups of patients and the
pattern by which they were formed into distinct groups. At this
stage we deal with survival status at the end of study and the
estimation of the survival functions for each of the distinct groups.
The group of operable patients, as may be expected, have much better
survival than the inoperable. For completeness we present the
survival rates of the two groups, Figure (8.3.1).

The various categories of the indicators do not suggest
a significant difference between any of the inoperable groups.
However, we note that some of the indicators do not affect the
survival times of the inoperables in the same direction as that of
the operable groups. There is no adequate statistical evidence that
this is a real difference, and it is likely to be due to chance. The
most striking case of the inoperables presenting a survival trend in
the opposite direction to that of the operables is given in Figure
(8.3.2): among the inoperables, spiculated tumours show a slightly
worse survival than smooth contour types, while among the operables
the spiculated group do better than those with smooth contours. Due
to the small numbers of the inoperables we will leave this subgroup
and concentrate on the operable group only. Clinically, the operable
group is of more interest in terms of prognosis since it is composed
of patients with less advanced disease.



Initially we will deal with the operable patients and the
different categories considered independent of time. Later we will
consider the time dependency of the various indices with their relevant
interpretations.

For the operable group we note that the node negative
patients tend to have a much better survival time than the node positive
patients. The median survival time of the node negative patients is
in fact 8 years and 9 months against 6 years and 4 months for the
node positive patients. Node histology is one of the well-accepted
prognostic indicators of survival time and we thus introduce it at
the first step of producing a relative risk function of the survival
times for the operable strata.

The total number of patients is 122. There are 68 node
negative patients and 54 node positive patients. At the end of the
study there are 80 patients with recorded death times and 42 censored
times. By use of Cox's proportional hazard model we estimate the
corresponding relative risk function, given by the model

$$RR = \exp(\beta_{\text{node}} \cdot \text{node})$$

node negative = 0, node positive = 1

$$\beta_{\text{node}} = 0.5319, \quad \text{S.E.} = 0.2144, \quad \chi^2 = 6.30, \quad p = 0.012$$

Figure (8.3.3) represents the survival times for the two groups of the 
node negative and node positive patients in the operable strata. 
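
As an illustration of how a fitted coefficient translates into a
relative risk, the following minimal Python sketch exponentiates the
node estimate quoted above and computes an approximate 95% interval
and a Wald chi-square with its p-value. The function name
`summarise_coefficient`, the normal-approximation interval and the
Wald form of the test are assumptions made for illustration. The text
does not state which chi-square statistic it reports, so the Wald
value need not agree exactly with the quoted 6.30.

```python
import math


def summarise_coefficient(beta, se, z=1.96):
    """Relative risk, approximate 95% interval, Wald chi-square and p-value
    for a single proportional hazards coefficient (illustrative helper)."""
    rr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    wald = (beta / se) ** 2
    # Upper tail of a chi-square variable with 1 d.f.: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return rr, lower, upper, wald, p_value


# Node estimate and standard error as quoted in the text.
rr, lower, upper, wald, p_value = summarise_coefficient(0.5319, 0.2144)
print(f"RR = {rr:.2f}, interval = ({lower:.2f}, {upper:.2f})")
print(f"Wald chi-square = {wald:.2f}, p = {p_value:.3f}")
```

Under this reading a node positive patient carries roughly 1.7 times
the hazard of a node negative patient at any given time.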



One further accepted indicator is that of size of the
initial tumour. Size in the group of operable patients plays
a slightly less important role than the node histology. The prognostic
importance of size nevertheless reaches a statistically significant
level for the operable group. (Among the inoperable group, however, we
do not detect a significant level and observe that the prognostic
effect is in the opposite direction to that of the operables.)
The model of the relative risk is

$$RR = \exp(\beta_{\text{size}} \cdot \text{size})$$

$$\beta_{\text{size}} = 0.4581, \quad \text{S.E.} = 0.2171, \quad \chi^2 = 4.78, \quad p = 0.029$$

Figure (8.3.4) refers to survival rates for size when a split for 
over and under 3.5 cm. lesions has been made. 



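The significance of size and node status however remains
when either node or size variability is introduced in the presence
of the other.

$$RR = \exp(\beta_{\text{size}} \cdot \text{size} + \beta_{\text{node}} \cdot \text{node})$$

$$\beta_{\text{node}} = 0.4160, \quad \text{S.E.} = 0.2091, \quad \chi^2 = 4.31, \quad p = 0.038$$

$$\beta_{\text{size}} = 0.3891, \quad \text{S.E.} = 0.1765, \quad \chi^2 = 4.26, \quad p = 0.037$$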

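To examine whether the magnitude of the size effect depends on node
status, we introduce an interaction term into the relative risk model,
giving

$$RR = \exp(\beta_{\text{size}} \cdot \text{size} + \beta_{\text{node}} \cdot \text{node} + \beta_{\text{int}} \cdot \text{size} \cdot \text{node})$$

$$\beta_{\text{node}} = 0.4271, \quad \text{S.E.} = 0.2087, \quad \chi^2 = 4.20, \quad p = 0.040$$

$$\beta_{\text{size}} = 0.3881, \quad \text{S.E.} = 0.1854, \quad \chi^2 = 4.09, \quad p = 0.043$$

$$\beta_{\text{int}} = 0.0092, \quad \text{S.E.} = 0.0426, \quad \chi^2 = 0.0731, \quad \text{N.S.}$$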
The above model allows the effect of size to be different in node
positive and node negative patients. In fact, with the introduction
of this interaction effect we do not observe any statistically
significant improvement over the additive model of size and node.
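
To make the comparison of the additive and interaction models
concrete, the sketch below tabulates the relative risk for each
combination of node and size under both sets of estimates quoted
above. It assumes, purely for illustration, that size enters the model
as a 0/1 indicator in the same way as node; the text does not state
the coding of size, and the function `relative_risk` is a hypothetical
helper.

```python
import math


def relative_risk(node, size, b_node, b_size, b_int=0.0):
    """Relative risk against the baseline group (node = 0, size = 0)."""
    return math.exp(b_node * node + b_size * size + b_int * node * size)


groups = [(n, s) for n in (0, 1) for s in (0, 1)]

# Estimates quoted in the text for the additive model.
additive = {g: relative_risk(g[0], g[1], 0.4160, 0.3891) for g in groups}
# Estimates quoted in the text for the model with an interaction term.
interaction = {g: relative_risk(g[0], g[1], 0.4271, 0.3881, 0.0092) for g in groups}

for node, size in groups:
    print(f"node={node} size={size}  "
          f"additive RR={additive[(node, size)]:.2f}  "
          f"interaction RR={interaction[(node, size)]:.2f}")
```

In the additive model the relative risk of the doubly adverse group
is exactly the product of the two single-factor relative risks, and
the small interaction estimate changes this product only slightly.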

By a single covariate relative risk model we study the effect
of various indicators. Microcalcification narrowly fails to reach
significance, at the 6.2% probability level, and the other indicators,
namely tumour foci and contour type, are of even less significance.
The indicators grade, extent and change, as we may expect, show
statistically significant levels in terms of time to death of
patients. The most significant contributor is grade, given by

$$RR = \exp(\beta_{\text{grade}} \cdot \text{grade})$$

$$\beta_{\text{grade}} = 0.5902, \quad \text{S.E.} = 0.2091, \quad \chi^2 = 8.02, \quad p = 0.005$$

However grade is related to size and node status. Thus after the 
introduction of node and size in fact we reduce the grade effect. 

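$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{\text{grade}} \cdot \text{grade})$$

$$\beta_{\text{node}} = 0.4311, \quad \text{S.E.} = 0.2109, \quad \chi^2 = 4.16, \quad p = 0.041$$

$$\beta_{\text{grade}} = 0.4021, \quad \text{S.E.} = 0.2231, \quad \chi^2 = 3.95, \quad p = 0.047$$

$$RR = \exp(\beta_{\text{size}} \cdot \text{size} + \beta_{\text{grade}} \cdot \text{grade})$$

$$\beta_{\text{size}} = 0.4081, \quad \text{S.E.} = 0.2051, \quad \chi^2 = 4.05, \quad p = 0.044$$

$$\beta_{\text{grade}} = 0.4019, \quad \text{S.E.} = 0.2162, \quad \chi^2 = 3.84, \quad p = 0.050$$
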



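Next we introduce a model of size, node and grade. The
estimators of node and size are very close to those in the model of
node and size alone, and the estimator for grade is not significant.

$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{\text{size}} \cdot \text{size} + \beta_{\text{grade}} \cdot \text{grade})$$

$$\beta_{\text{node}} = 0.4408, \quad \text{S.E.} = 0.2010, \quad \chi^2 = 4.81, \quad p = 0.028$$

$$\beta_{\text{size}} = 0.4135, \quad \text{S.E.} = 0.2081, \quad \chi^2 = 3.92, \quad p = 0.047$$

$$\beta_{\text{grade}} = 0.2850, \quad \text{S.E.} = 0.2557, \quad \chi^2 = 1.23, \quad \text{N.S.}$$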



Both extent and change produce statistically important
relative risk patterns in terms of survival. If inserted singly, we
obtain:

$$RR = \exp(\beta_{\text{ext}} \cdot \text{ext})$$

(Extent < \ inch deep) = 0, (Extent > k inch deep) = 1

$$\beta_{\text{ext}} = 0.2425, \quad \text{S.E.} = 0.1162, \quad \chi^2 = 4.33, \quad p = 0.037$$

$$RR = \exp(\beta_{\text{change}} \cdot \text{change})$$

(complicated change not indicated) = 0, (complicated change indicated) = 1

$$\beta_{\text{change}} = 0.2637, \quad \text{S.E.} = 0.1206, \quad \chi^2 = 4.81, \quad p = 0.028$$

Extent is slightly less significant than change. However, the two
categories represent almost overlapping subgroups of patients in
terms of their own good (0 level) and bad (1 level) prognostic
indicators.



First we introduce extent and change in the presence of the
size effect. Their significance level shows little change. Once again
extent is less significant than change.

$$RR = \exp(\beta_{\text{ext}} \cdot \text{ext} + \beta_{\text{size}} \cdot \text{size})$$

$$\beta_{\text{ext}} = 0.2638, \quad \text{S.E.} = 0.1259, \quad \chi^2 = 4.39, \quad p = 0.036$$

$$\beta_{\text{size}} = 0.3934, \quad \text{S.E.} = 0.1785, \quad \chi^2 = 4.63, \quad p = 0.031$$

$$RR = \exp(\beta_{\text{change}} \cdot \text{change} + \beta_{\text{size}} \cdot \text{size})$$

$$\beta_{\text{change}} = 0.2701, \quad \text{S.E.} = 0.1288, \quad \chi^2 = 4.45, \quad p = 0.035$$

$$\beta_{\text{size}} = 0.3939, \quad \text{S.E.} = 0.1783, \quad \chi^2 = 4.61, \quad p = 0.030$$

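Now we will consider the possibility of an interaction effect between
extent and size, and further between change and size (node
interactions are considered later).

$$RR = \exp(\beta_{\text{ext}} \cdot \text{ext} + \beta_{\text{size}} \cdot \text{size} + \beta_{\text{int}} \cdot \text{ext} \cdot \text{size})$$

$$\beta_{\text{ext}} = 0.2641, \quad \text{S.E.} = 0.1248, \quad \chi^2 = 4.51, \quad p = 0.034$$

$$\beta_{\text{size}} = 0.3942, \quad \text{S.E.} = 0.1785, \quad \chi^2 = 4.79, \quad p = 0.028$$

$$\beta_{\text{int}} = 0.0176, \quad \text{S.E.} = 0.0712, \quad \chi^2 = 0.059, \quad \text{N.S.}$$

$$RR = \exp(\beta_{\text{change}} \cdot \text{change} + \beta_{\text{size}} \cdot \text{size} + \beta_{\text{int}} \cdot \text{change} \cdot \text{size})$$

$$\beta_{\text{change}} = 0.2694, \quad \text{S.E.} = 0.1285, \quad \chi^2 = 4.39, \quad p = 0.036$$

$$\beta_{\text{size}} = 0.3941, \quad \text{S.E.} = 0.1780, \quad \chi^2 = 4.64, \quad p = 0.031$$

$$\beta_{\text{int}} = 0.0211, \quad \text{S.E.} = 0.0619, \quad \chi^2 = 0.184, \quad \text{N.S.}$$

Thus the effect of extent and change is additive in the presence of
size and the interaction effect is not significant.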



We can now consider extent and change in the presence of
node histology. In the previous discussions node and size were
clearly major contributors in defining survival time. First we
deal with the relative risk for node and change status.

$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{\text{change}} \cdot \text{change})$$

$$\beta_{\text{node}} = 0.4381, \quad \text{S.E.} = 0.2097, \quad \chi^2 = 4.71, \quad p = 0.029$$

$$\beta_{\text{change}} = 0.2517, \quad \text{S.E.} = 0.1183, \quad \chi^2 = 4.61, \quad p = 0.032$$

Extent presents a similar pattern to that of change.

$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{\text{ext}} \cdot \text{ext})$$

$$\beta_{\text{node}} = 0.4376, \quad \text{S.E.} = 0.2081, \quad \chi^2 = 4.57, \quad p = 0.032$$

$$\beta_{\text{ext}} = 0.2480, \quad \text{S.E.} = 0.1198, \quad \chi^2 = 4.28, \quad p = 0.038$$

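The value of change with node is significant and it is interesting
to study the effect of size in this respect. That is, we assess the
survival variability which is unexplained in terms of node and change
by the introduction of size. Before doing so we study the effect of
an interaction between change and node, and between extent and node.

$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{\text{change}} \cdot \text{change} + \beta_{\text{int}} \cdot \text{change} \cdot \text{node})$$

$$\beta_{\text{node}} = 0.4392, \quad \text{S.E.} = 0.1988, \quad \chi^2 = 5.15, \quad p = 0.023$$

$$\beta_{\text{change}} = 0.2524, \quad \text{S.E.} = 0.1182, \quad \chi^2 = 4.56, \quad p = 0.032$$

$$\beta_{\text{int}} = 0.0896, \quad \text{S.E.} = 0.0489, \quad \chi^2 = 3.78, \quad p = 0.052$$

$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{\text{ext}} \cdot \text{ext} + \beta_{\text{int}} \cdot \text{ext} \cdot \text{node})$$

$$\beta_{\text{node}} = 0.4366, \quad \text{S.E.} = 0.2114, \quad \chi^2 = 4.28, \quad p = 0.038$$

$$\beta_{\text{ext}} = 0.2593, \quad \text{S.E.} = 0.1274, \quad \chi^2 = 4.17, \quad p = 0.041$$

$$\beta_{\text{int}} = 0.0782, \quad \text{S.E.} = 0.0523, \quad \chi^2 = 2.47, \quad \text{N.S.}$$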

There is a slight indication of an interaction effect between node
and change, suggesting that complicated change and node positivity
together add extra risk. However this is not of great importance
since it may be a spurious significance. The size variability,
however, has not been included in our model. If we do introduce a
size effect into the relative risk function, none of the interactions
remain significant.

$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{\text{change}} \cdot \text{change} + \beta_{\text{size}} \cdot \text{size})$$

$$\beta_{\text{node}} = 0.4279, \quad \text{S.E.} = 0.2136, \quad \chi^2 = 4.21, \quad p = 0.040$$

$$\beta_{\text{change}} = 0.2512, \quad \text{S.E.} = 0.1215, \quad \chi^2 = 4.41, \quad p = 0.036$$

$$\beta_{\text{size}}: \quad \text{S.E.} = 0.1976, \quad \chi^2 = 4.51, \quad p = 0.033$$



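In terms of extent no interaction effects are significant, and if we
introduce a model of size, node and extent, once again a similar
pattern emerges to that of change.

$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{\text{ext}} \cdot \text{ext} + \beta_{\text{size}} \cdot \text{size})$$

$$\beta_{\text{node}} = 0.4281, \quad \text{S.E.} = 0.2237, \quad \chi^2 = 4.40, \quad p = 0.036$$

$$\beta_{\text{ext}} = 0.2[?]4, \quad \text{S.E.} = 0.1268, \quad \chi^2 = 3.90, \quad p = [?]$$

$$\beta_{\text{size}} = 0.3927, \quad \text{S.E.} = 0.2093, \quad \chi^2 = 4.4, \quad p = 0.045$$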

The main reason for introducing the change concept has been that of 
considering an effect of tumour initial status by which some of the 
attributes in terms of external progress of tumour may be explained. 
This effect is clearly not sufficiently explained by node and size 
classification alone. 



We will now introduce the concept of time dependency in
this section for the various prognostic indicators. One reason for
this conceptual change of model is to study the effect of node or size
over a time scale and test how each prognostic effect may eventually
diminish or increase over time.

First we consider a relative risk function based on node
histology and a time dependent term in t* = ln(time).

$$RR = \exp(\beta_{\text{node}} \cdot \text{node} + \beta_{t} \cdot \text{node} \cdot t^{*})$$

$$\beta_{\text{node}} = 0.4284, \quad \text{S.E.} = 0.2071, \quad \chi^2 = 4.29, \quad p = 0.038$$

$$\beta_{t} = 0.0785, \quad \text{S.E.} = 0.1211, \quad \chi^2 = 0.420, \quad \text{N.S.}$$

There is not a great improvement in the overall likelihood from the
introduction of the time dependency factor into the model. Thus we
may consider the effect of node histology to be static in terms of
prognostic value. Once a patient is node positive the patient is at
a higher risk, and this risk for the individual patient, in relative
terms, does not decrease or increase with the passage of time.

Size is the second factor we study with respect to time
dependency.

$$RR = \exp(\beta_{\text{size}} \cdot \text{size} + \beta_{t} \cdot \text{size} \cdot t^{*})$$

$$\beta_{\text{size}} = 0.4201, \quad \text{S.E.} = 0.1964, \quad \chi^2 = 4.32, \quad p = 0.036$$

$$\beta_{t} = -0.1291, \quad \text{S.E.} = 0.0623, \quad \chi^2 = 4.19, \quad p = 0.040$$

The size effect clearly diminishes with the passage of time. The
patients with larger tumours are at a higher risk in the early period
after diagnosis. However, for the larger tumours that do not
correspond with an early death, the prognostic significance of size
will eventually diminish in terms of relative risk.
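
The way the size effect fades under the time dependent model can be
seen by evaluating the relative risk at a few time points. The sketch
below is a minimal illustration using the two size estimates quoted in
the text; it assumes size is coded 0/1 and leaves the time unit
unspecified, since neither is stated here. The names
`size_relative_risk` and `crossover` are hypothetical.

```python
import math

BETA_SIZE = 0.4201   # main size effect quoted in the text
BETA_T = -0.1291     # coefficient of size * ln(time) quoted in the text


def size_relative_risk(t):
    """Relative risk of large versus small tumours at time t > 0, with t* = ln(t)."""
    if t <= 0:
        raise ValueError("time must be positive")
    return math.exp(BETA_SIZE + BETA_T * math.log(t))


# Time at which the fitted relative risk falls to 1 (BETA_SIZE + BETA_T * ln t = 0).
crossover = math.exp(-BETA_SIZE / BETA_T)

for t in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(f"t = {t:5.1f}  RR = {size_relative_risk(t):.2f}")
print(f"fitted RR reaches 1 at t = {crossover:.1f} (in the model's time units)")
```

Because the time dependent coefficient is negative, the fitted
relative risk is a decreasing power of time, which is the pattern
described in the text as a diminishing size effect.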
Extent and change were two other indicators that
produced some significant contributions in explaining the survival
variability. Now we consider extent and change as two time
dependent variables.

$$RR = \exp(\beta_{\text{change}} \cdot \text{change} + \beta_{t} \cdot t^{*} \cdot \text{change})$$

$$\beta_{\text{change}} = 0.2281, \quad \text{S.E.} = 0.1109, \quad \chi^2 = 4.22, \quad p = 0.040$$

$$\beta_{t} = -0.0819, \quad \text{S.E.} = 0.0902, \quad \chi^2 = 0.72, \quad \text{N.S.}$$

$$RR = \exp(\beta_{\text{ext}} \cdot \text{ext} + \beta_{t} \cdot t^{*} \cdot \text{ext})$$

$$\beta_{\text{ext}} = 0.2319, \quad \text{S.E.} = 0.1132, \quad \chi^2 = 4.24, \quad p = 0.039$$

$$\beta_{t} = -0.0792, \quad \text{S.E.} = 0.0876, \quad \chi^2 = 0.96, \quad \text{N.S.}$$

Neither the change nor the extent contribution is affected
significantly by the time dependent variability. In terms of
interpretation we conclude that change and extent classify patients
at the beginning and their effect is consistently the same in terms
of relative risk of death. Node histology therefore has a prognostic
effect which may be interpreted in a similar way to that of extent
and change.



One interesting question that may be asked is related to the
effect of time on the magnitude of the size effect. Given that size is
a time dependent factor, that is, the sizes of tumour do not all
conform to a single fixed relative risk function and some patients are
a slightly different type of survivor, is there a prognostic factor,
measurable and static at the beginning of the study, by which we may
separate the time dependency of the size effect? Although the question
and the evidence from the data may simply be represented by a few
histograms, in terms of statistical significance there are a few
models, all of which, according to this data, can explain the
variability within the data. Primarily we presented a model of
relative risks based on node histology, size, complicated change and
extent. In the comparison of the models of complicated change and
extent there is little to choose between the two, in so far as our
study is concerned. For practical reasons, however, the extent of
tumour may be an easier variable to measure. In the interpretation of
the time dependency of the size effect we may conclude that there
exists a subgroup of patients in whom largeness of the tumour is a
sign of bad prognosis. With the passage of time in the survival scale,
however, the value of size as a prognostic variable diminishes.
CHAPTER 9
FINAL SUMMARY 

In this the final chapter, we will summarise the findings 
of the thesis. We will separate the findings into the statistical 
and medical, and allocate a section to each. In these sections we 
will present an overview of ideas which may be useful for future research. 

9.1 Overview and conclusions of the statistical results.

Initially we identified various hazard shapes which have
been reported in the literature. Such methods were useful for
presenting in a descriptive manner the patterns of events in the time
scale for the different subgroups of patients. Further, we discussed
recent developments in the area of non-parametric methods and the way
in which such methods are able to provide a flexible approach for
classifying different non-parametric tests which are often used in
survival analysis (such as the logrank and Wilcoxon tests). In the
area of parametric methods we considered various analytical methods
and, in comparing the various assumptions of the methods with
empirical data with subgroups, we found the methods theoretically
restrictive but, in practice, consistent in their conclusions for our
data set. Primarily we performed the analysis of the old Edinburgh
trial data by the different parametric and non-parametric methods
purely for the purpose of comparing the statistical methods. In terms
of conclusions we did not find any inconsistencies between any of the
parametric and non-parametric methods. However, as expected, we were
able to attribute the slight differences between the two
non-parametric tests, logrank and Wilcoxon, to the weighting attached
to the events within the time scale of study. We deferred the
discussion of the difference between the various methods (in terms of
significance levels) to later chapters, where the concept of time
dependency is more developed. Parallel with the above discussions we
considered multivariate methods and how concepts such as multivariate
prognostic factors and multivariate events may be employed in
analysis. We considered efficiency and robustness of an approach to be
two factors of extreme importance when dealing with the above forms of
interrelationships between various events and prognostic factors. A
method that we found suitable for this type of analysis was Cox's semi
non-parametric proportional hazard model.
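
The remark on weighting can be illustrated with a two-group weighted
logrank statistic in which the weight at each event time is either 1
(logrank) or the number at risk (the Gehan form of the Wilcoxon test).
The sketch below is a minimal implementation under these standard
definitions, not the computation used in the thesis; the function
`weighted_logrank` and the small data set are hypothetical and serve
only to show that the two weightings give different statistics from
the same data.

```python
def weighted_logrank(times, events, groups, weight="logrank"):
    """Chi-square (1 d.f.) statistic of a two-group weighted logrank test.

    events: 1 = death observed, 0 = censored; groups: coded 0 or 1.
    weight: "logrank" uses weight 1, anything else uses the number at risk.
    """
    records = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in records if e == 1})
    score = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in records if u >= t)            # at risk overall
        n1 = sum(g for u, _, g in records if u >= t)           # at risk in group 1
        d = sum(1 for u, e, _ in records if u == t and e == 1)  # deaths at t
        d1 = sum(g for u, e, g in records if u == t and e == 1)  # deaths in group 1
        w = 1.0 if weight == "logrank" else float(n)
        score += w * (d1 - d * n1 / n)                          # observed minus expected
        if n > 1:
            # Hypergeometric variance of d1 given the margins at time t.
            variance += w * w * d * (n1 / n) * (1.0 - n1 / n) * (n - d) / (n - 1)
    return score * score / variance


# Hypothetical data: group 1 fails early, group 0 fails late, no censoring.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 1]
groups = [1, 1, 1, 1, 0, 0, 0, 0]

logrank_stat = weighted_logrank(times, events, groups, weight="logrank")
wilcoxon_stat = weighted_logrank(times, events, groups, weight="gehan")
print(f"logrank = {logrank_stat:.3f}, Gehan-Wilcoxon = {wilcoxon_stat:.3f}")
```

The Wilcoxon weights are largest at early event times, when most
patients are still at risk, so the two tests can differ whenever the
group difference is not constant over the time scale.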



One important aspect of Cox's method, which can provide a
robust framework for the analysis of such data, is the manner in
which the actual survival times are transformed into ranks. Before
proceeding with the development of models using Cox's method, we
presented transition rates between the various states of the old
Edinburgh trial, using an explanatory stochastic method which
was referred to as the non-parametric semi-Markov model. Although
the approach was considered to be informative, we found Cox's
method more suitable in the manner by which it could provide a check
on the model assumptions, using the information on the intervening
events. Initially we considered the expansion of the models with
fixed relative risks into models that have covariates with an internal
variability within the time scale. At times we found checks on the
assumption of consistency of a prognostic effect useful in a proper
interpretation of the data. We referred to such models as models with
time dependent prognostic effects. Alternatively an interpretation was
possible by employing the concept of an intervening event within the
time scale.

We found that by the utilisation of the information on an 
important progression event such as metastatic recurrence, we were able 
to check on the consistency of the relative treatment effect for the times 
prior to and post intervening event. In general such intervening events 
are random events and we used a family of transformations of the time of the 
intervening event, in order to check on the consistency of the goodness of 
fit tests. We found that in practice such a consistency was present and
that the proportional hazard model of non-parametric type was quite
suitable (for the covariate subgroups that we dealt with in the data).

We allocated a full chapter to the simulation methods for
a clinical trial study. In this study we presented the small sample
properties of the various statistical methods (in particular Cox's
method) using simulated data. The method of simulation we adopted had
the useful property of being able to generate increasing, decreasing
and constant hazard rates with covariates. In fact all the generated
samples belonged to the family of proportional hazards of the Weibull
type. This property was found useful when we dealt with a simulation
study of time dependencies.
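
One common way of generating survival times from a Weibull
proportional hazards family is inversion of the survival function. The
sketch below assumes the parameterisation S(t | x) = exp(-scale *
t**shape * exp(beta * x)), so that shape > 1, shape < 1 and shape = 1
give increasing, decreasing and constant hazards respectively. It is
an illustration of the general technique and is not claimed to be the
generator used in the thesis; `weibull_ph_time` and `simulate_arm` are
hypothetical names.

```python
import math
import random


def weibull_ph_time(rng, x, beta, scale=1.0, shape=1.0):
    """Draw one survival time with hazard
    scale * shape * t**(shape - 1) * exp(beta * x), by inverse transform."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return (-math.log(u) / (scale * math.exp(beta * x))) ** (1.0 / shape)


def simulate_arm(rng, n, x, beta, scale=1.0, shape=1.0):
    """Survival times for n patients sharing the covariate value x."""
    return [weibull_ph_time(rng, x, beta, scale, shape) for _ in range(n)]


rng = random.Random(0)
control = simulate_arm(rng, 20000, x=0, beta=0.5)   # constant hazard, rate 1
treated = simulate_arm(rng, 20000, x=1, beta=0.5)   # constant hazard, rate exp(0.5)
mean_control = sum(control) / len(control)
mean_treated = sum(treated) / len(treated)
print(f"mean control = {mean_control:.3f}, mean treated = {mean_treated:.3f}")
```

With shape = 1 the two arms are exponential, so their mean survival
times should be close to 1 and to exp(-0.5) respectively, which gives
a simple check on the generator.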

An important property of survival studies, as discussed before,
is that of censoring of the survival times. In developing a
simulation method we discussed various approaches to generating
censored survival times and adopted one which is suitable for trial
data and can give a constant proportion of censored cases. Our
initial intention in performing the simulation was an assessment
of the small sample properties of Cox's method. However, later,
in the study of time dependencies with Cox's method, we also
considered the Weibull and exponential parametric methods. Within
these simulations we used a range of sample sizes, significance
levels, levels of censoring and a range of treatment and covariate
effects. In order to assess the power properties we constantly
referred to the asymptotic normality and the likelihood ratio tests.

Initially we discussed the power properties of the simple
test of hypothesis for the treatment effect with both the treatment
effect and the covariate effect set to zero. This value is clearly an
indicator of type one error. We obtained efficiency values close
to the expected values according to the significance levels. In
repeating the simulations for a range of covariate effects we noted
that the efficiency of simple tests of hypothesis for treatment is not
in any way influenced by the value of the covariate effect. Clearly,
as we expected, the efficiency of the tests does deviate to some
extent according to the values of sample size, censoring and
significance level; however, none of these factors affects the lack of
influence of the covariate effect value. A point of some interest was
that the decline in the power of the simple tests due to censoring,
which seems to be affected by the sample size to some extent,
indicated a lower loss due to censoring for higher sample sizes.
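
The idea of reading the type one error and the power from simulated
rejection rates can be sketched as follows. For brevity the example
uses an exponential model with a Wald test of the log hazard ratio and
an optional fixed censoring time; these are simplifying assumptions
and differ from the Cox analyses and the constant-proportion censoring
scheme summarised in the text. The function
`simulate_rejection_rate` is a hypothetical name.

```python
import math
import random


def simulate_rejection_rate(n_per_arm, beta, n_sims=1000, censor_time=None,
                            seed=1, z_crit=1.96):
    """Proportion of simulated two-arm trials in which the Wald test of the
    exponential log hazard ratio rejects the hypothesis beta = 0."""
    rng = random.Random(seed)
    rejections = 0
    usable = 0
    for _sim in range(n_sims):
        deaths = [0, 0]
        exposure = [0.0, 0.0]
        for arm in (0, 1):
            rate = math.exp(beta * arm)          # hazard in this arm
            for _patient in range(n_per_arm):
                t = rng.expovariate(rate)
                if censor_time is not None and t > censor_time:
                    exposure[arm] += censor_time  # censored observation
                else:
                    deaths[arm] += 1
                    exposure[arm] += t
        if deaths[0] == 0 or deaths[1] == 0:
            continue                              # estimate undefined, skip trial
        estimate = math.log((deaths[1] / exposure[1]) / (deaths[0] / exposure[0]))
        std_err = math.sqrt(1.0 / deaths[0] + 1.0 / deaths[1])
        usable += 1
        if abs(estimate / std_err) > z_crit:
            rejections += 1
    return rejections / usable


type_one = simulate_rejection_rate(50, beta=0.0)
power = simulate_rejection_rate(50, beta=0.7)
power_censored = simulate_rejection_rate(50, beta=0.7, censor_time=0.5)
print(f"type one error = {type_one:.3f}")
print(f"power = {power:.3f}, power with censoring = {power_censored:.3f}")
```

With the treatment effect set to zero the rejection rate estimates
the type one error and should lie near the nominal 5% level; with a
non-zero effect it estimates the power, which falls when censoring
removes observed deaths.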
Next we discussed power curves for the composite tests.
These simulations had a change of emphasis in that they were presented
for a more theoretical interest, and we showed that Cox's method
has good, predictable small sample properties. At first we dealt
with the type one error and showed that our results conform with
the levels of the significance limit. This finding was true for both
the asymptotic likelihood and the asymptotic normality tests. Later
we considered a range of treatment and covariate effects. We
concluded, as expected, that sample size, significance level and
censoring level do influence the power of the tests. However, the
relative efficiency of the above factors is not influenced
significantly by the treatment effect or covariate effect.

In comparing the asymptotic normality and asymptotic likelihood
tests we note that the asymptotic normality test is in general more
conservative as the treatment and covariate effects deviate from
zero. Up until this point we have summarised simulation results
when the generated samples were based on an exponential distribution.
Next we deal with the summary of results from the Weibull distribution.
We reported the simulations for the Weibull samples in which the
proportionality of hazards had not been violated. We found a very
close resemblance between the efficiencies of simulations on
increasing, decreasing and constant hazards (all other factors, e.g.
sample size and censoring, being equal). We attributed this close
resemblance to the non-parametric nature of Cox's method.



At this stage we deal with results of the Weibull distributed
samples in which the proportionality of the hazards was violated.
That is, there existed a degree of time dependency for the covariate
and treatment effects. Consistently we noted that the type one error
fell below the actual significance level. In fact we noted that this
power decreases with an increasing departure from the proportionality
of the hazards. For the range of treatment and covariate effects
(negative and positive) we noted that for the non-proportionality
with divergence [...] with the non-proportionality with a convergence
to the base line.

In comparing the asymptotic normality test to the likelihood ratio
test we observed that the normality test is more conservative. We
repeated the simulation for a range of sample sizes and censoring
levels and the conclusions were consistently the same. One pattern
which emerged was that, due to non-proportionality, the power curves
for the various composite tests did not have the symmetric pattern of
the proportional hazard situation. However, we noticed that with an
increase in the sample size there was a reduction in the lack of
symmetry about the covariate effect axis.

In the final discussion of the simulation results we studied
simple tests of hypothesis in the presence of one covariate effect.
We generated non-proportional hazard data of the Weibull type and, in
order to analyse the data, we used the exponential model, the Weibull
model, Cox's proportional hazard model, Cox's stratified model and
Cox's time dependent model. With the treatment effect set to zero we
consistently noted that the power efficiency is close to the
significance level for all models. This pattern was consistently the
same for non-proportional hazard samples. In order to study the
influence of the covariate effect we increased the value of the
covariate effect, and again there was little change in the type one
errors.



Next we considered non-zero values of the treatment effect
for both non-proportional and proportional hazard samples. We
concluded that for the analysis of the non-proportional hazard samples
both the stratified Cox's model and the time dependent Cox's model
were suitable. This was true for a range of covariate effect values
and we noted that the magnitude of the covariate effect did not
influence the power of the tests. The unsuitable models were the Cox's
model with fixed relative risk, the Weibull model and the exponential
model respectively, with the exponential being much worse than the
other two. For these three unsuitable models we noted that the
magnitude of the covariate effect does influence the power of the
tests. In summary we concluded that specification of the correct model
is of some importance when dealing with proportional hazard models.
The Weibull and Cox's models (fixed relative risk) are the most
suitable for analysing proportional hazard data, and the time
dependent Cox's model and stratified Cox's model are less suitable.

In terms of magnitude of the efficiencies we noted that 
at high sample sizes of 100 there was little to choose in general 
between the models (except exponential) while at low sample sizes of 
25 some of the problems with the specification of the wrong model were 
more apparent. 

9.2/ 
In dealing with the applications of the various statistical
methods we presented two data sets, both of which dealt with
primary breast cancer. The first data set was referred to as the
South East of Scotland trial data. The major objective of this
trial was an assessment of the survival rates for a group of patients
treated by radical surgery versus the group treated by simple surgery
and radiotherapy. Before commencing the analysis of this data
we discussed the important design aspects of this trial, such as
patient eligibility rules, stratification and data administration.
This trial, with regard to the magnitude of the data which was
collected and the type of events that were expected to take place
within the survival time of each patient, is suitable for an
exploratory analysis of the various interrelationships, such as the
multivariate events and multiple prognostic indicators, as discussed
earlier.

As we indicated by the study of the cross-tabulations, there
was in most respects a very uniform balance between the two treatment
groups and the various prognostic indicators. Initially we performed
an analysis based on the conventional methods. This analysis
indicated that the radical surgery patients may have an overall higher
survival rate compared with the group treated by simple mastectomy and
radiotherapy. We further indicated that the significance levels of the
treatment differences, as expected, were not consistently the same for
the different prognostic subgroups.



At the first stages of the exploratory analysis of the data
we were interested in the study of the duration from randomisation
to any single event of importance, namely any of death, metastatic
recurrence, or local recurrence. In this approach we used various
stepwise regression methods with Cox's proportional hazard model.
A point to note was that in the approach we considered step-up and
step-down procedures, together with a different procedure by which,
as an a priori rule, we forced the treatment effect into the model at
the first step regardless of its significance relative to the other
prognostic factors. The above methods consistently yielded the same
model, indicating a better survival for the radical surgery group,
with the significant prognostic contributors to the model being
menopausal status, size of the tumour and the node status.

In order to make sure the findings were not dependent upon the model
assumptions, we used a stratified analysis at each step of introducing
a new term into the model. Once again we noted that the direction of
the effects was consistently the same. In order to assess the
multiplicative effect of the various indicators we performed tests of
the interaction effects using Cox's method, and we did not find any
evidence of interactions for the survival times.

In the next stage of the analysis we considered the time period from
randomisation to the development of metastatic disease. We once again
used the above stepwise procedures for model reduction with the same
set of prognostic effects. We noted that the final model was
consistently the model involving treatment option, size, node and
menopausal status. However, we also noted slight deviations in the
order of entry of the various terms. This pattern is not testable at
this stage; however, a difference in the order of importance of the
prognostic effects for these two response variables is of importance
in the later discussions, when we consider time dependency [...].

The analysis based on the randomisation [...] also presented a
consistent model indicating the same main effects.

In the summary of the results so far we have confined the conclusions
to those of the analysis of the time from randomisation to an
important final event. From this point onwards we will summarise the
analysis of the South East of Scotland trial with models of multiple
risks in which patients move from one state to another.

In general we attributed the developments within the time scale to
intertwined processes. Such processes were combinations of epochal
events such as metastatic recurrence, local recurrence or death.
Alternatively we may have been interested in the assessment of
cumulative risks over time for a single covariate.

Initially we considered an analysis based on semi-Markov models. This
approach gave an explanatory stochastic description of the movement of
patients from one state to another. In terms of presentation of the
results, we derived general expressions for the survival rates in
order to obtain close approximations to the transition rates. More
important, however, are the results by which we represented the above
survival rates in terms of exploratory models of the proportional
hazards type. Using the proportional hazards models with time
dependent covariates, we were able to ask questions such as: given
that an event has taken place (such as the progression of disease),
how does the relative risk for the treatment groups and prognostic
effects behave within the time scale? In the study of the empirical
hazard rates we consistently referred to hazard rates in which the
proportionality of the hazards may have been violated. We were then
in a position to test any possible departures from the assumptions
of the model.
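
One common way of preparing data for a time dependent covariate of
this kind is to split each patient's follow-up at the intervening
event. The Python sketch below illustrates the idea only; the function
name split_at_event and the row layout (start, stop, died, recurred)
are assumptions made for the example and are not taken from the
program listed in Appendix B.

```python
def split_at_event(entry, exit_time, died, recurrence_time=None):
    """Split one patient's follow-up into (start, stop, died, recurred) rows.

    `recurred` is the time dependent covariate: 0 before the intervening
    event (for example a recurrence) and 1 afterwards. A death can only
    be recorded on the last row of a patient.
    """
    inside = (recurrence_time is not None
              and entry < recurrence_time < exit_time)
    if not inside:
        # No split needed: the covariate is constant over the follow-up.
        recurred = int(recurrence_time is not None and recurrence_time <= entry)
        return [(entry, exit_time, died, recurred)]
    return [
        (entry, recurrence_time, 0, 0),         # at risk, not yet recurred
        (recurrence_time, exit_time, died, 1),  # at risk after recurrence
    ]
```

For example, a patient randomised at time 0 who recurs at 4 and dies
at 10 contributes to the risk sets before time 4 with the indicator
at 0 and to the later risk sets with the indicator at 1.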



At first we considered a test of the time dependency of the treatment
effect, by which we concluded that there was no evidence of the
proportional hazards assumption being violated with respect to the
treatment effect. We then studied the survival rates according to the
time dependency of the size effect, by which we concluded there was
not sufficient evidence to reject the constant relative risk
assumption.



Next we studied the time from metastatic or local recurrence to
death, given that a recurrence had already taken place. First we
stratified the data according to the period from randomisation to the
appearance of local or metastatic disease, in order to compare radical
versus simple surgery in a model in which the time to the appearance
of first recurrence was controlled. The pattern emerging was that for
patients recurring after the first year, having had radical surgery
was less likely to benefit the patient. However, for those recurring
early there was a benefit in terms of survival from radical surgery.



Finally for this data we considered the analysis of the data
according to the secondary events that had occurred within the full
patient survival time. Such secondary events were taken to be local or
metastatic recurrence of the disease. Initially we dealt with the
relative risk function according to the treatment effect and tested
the goodness of fit in order to detect departures from the
proportional hazards assumption of the type in which the relative risk
after the secondary event may act differently for the two treatment
groups. Once again we noted that the proportional hazards model is
quite appropriate and that all conclusions were consistently in line.
Although the non-proportionality did not reach significance, we noted
that the treatment effect was more substantial prior to the
development of metastatic disease.

In the previous chapter of the thesis we considered survival rates of
a group of cancer patients in order to assess the importance of
certain prognostic indicators. Before analysis of the data we made an
initial distinction between indicators that are regularly assessed in
the staging of patients and some other indicators that have not been
considered a great deal in the past. In general the variables we were
interested in were: tumour contour type, operability, size, node,
extent, grade, change, tumour foci and microcalcification. In the
analysis of the data we concentrated on the survival time of the
patients as the only response variable. However, we did not deal only
with a study of static prognostic factors but also studied how changes
may occur in the value of the initial prognosis.

At first we noted that the various indicators for the operables and
the inoperables did not necessarily indicate prognosis in the same
direction. The prognostic indicators for the inoperables showed no
significant differences and we consequently confined the analysis to
the operable patients. For the operable patients we noted that node
negative patients had a better survival rate than the node positives.
Further, size of the tumour conformed with what was expected, and
patients with smaller tumours showed better survival rates. In terms
of the relative value of size and node we noticed that the
significance of either factor remained in the presence of the other.
In terms of the direction of the effects of size and node there is no
evidence of an interaction. Two further prognostic factors found to
have an important influence on the final survival pattern of the
patients were the extent of the tumour and the presence of complicated
change in the tumour. We observed that extent is slightly less
significant than change. However, the two categories represented
almost overlapping subgroups in terms of good and bad prognosis, the
good prognosis being tumour with less than ½ inch depth and tumour
with no evidence of complicated change. We performed tests of
interaction for the different indicators and found none significant.
We further performed tests of time dependency of the indicators and
found that the time dependency of size indicated that size is of more
importance during early periods of survival.

Throughout the course of this thesis we have presented the models
with an emphasis on interrelationships. A study of such factors will
clearly imply a check on the generalisations and the assumptions of
the models, and may introduce a diversity of interpretations.
According to the data sets that are here examined it has been
possible to show a consistency of results in the final analysis. It is
however important to consider the impact of such methods in situations
where the data may be the outcome of constantly evolving treatments
and where the analysis is performed in a situation of widely
accessible distributed computing procedures.
Cross tabulation tables for the prognostic factors referred to in
section 6.4.

                  Menopausal Status
Node        Prem    Meno    Post M.    Total

N0           107      21       247       375
N1            56      17       112       185

Total        163      38       359       560



                  Menopausal Status
T stage     Prem    Meno    Post M.    Total

T1            22       2        32        56
T2           114      26       257       397
T3            27      10        70       107

Total        163      38       359       560



                   Side
T stage     Right    Left    Total

T1             22      34       56
T2            205     192      397
T3             57      50      107

Total         284     276      560



                                 Site
T Stage     Medial   Lateral   Central   Both     Whole    Total
            only     only                halves   Breast

T1             20        30         6                         56
T2            133       203        48       13               397
T3             30        53        13        9        2      107

Total         183       286        67       22        2      560


                                 Site
S Stage     Medial   Lateral   Central   Both     Whole    Total
            only     only                halves   Breast

S1            108       161        31        7               307
S2             44        70        21        6               141
S3             31        55        15        9        2      112

Total         183       286        67       22        2      560


                                 Site
Node        Medial   Lateral   Central   Both     Whole    Total
Status      only     only                halves   Breast

N0            133       191        38       12        1      375
N1             50        95        29       10        1      185

Total         183       286        67       22        2      560
Section A.2



Skin involvement 



Pectoral Muscle 
involvement 



Disease Status Total 



T0 T1 T2 T3 NI    T0 T1 T3 NI    None L M L+M



Meno- 
pausal 
status 

Prem 

Meno 

Post M 



5 16 7 134 



1 1 



7 26 



6 51 22 280 



8 16 8 131 



2 22 



1 38 33 287 



27 34 3 99 163 
11 7 1 19 38 
76 100 12 171 359 



Side 

Right 1 7 41 15 220 
Left 1 5 29 21 220 



9 33 22 220 
5 30 21 220 



56 73 9 146 284 
58 68 7 143 276 



Site 

Med . only 1 

Lat.only 

Central 1 

Both 

Whole 
Breast 



6 17 10 149 
5 37 16 228 



7 50 
3 13 





2 16 15 150 

7 28 23 228 

2 11 4 50 

3 7 12 

110 



32 40 4 99 103 

58 60 10 158 286 

18 22 2 25 67 

6 9 7 22 



2 







T Stage 
T1 1 
T2 
T3 1 



50 
381 



5 
9 



1 2 48 
6 1 381 



9 58 36 



56 40 11 



3 14 1 38 56 
85 90 11 211 397 
26 37 4 40 107 



Node 
Section A.2 (cont'd)

Skin involvement

T0 T1 T2 T3 NI

Node

N0 2 7 38 27 301

N1 5 32 9 139

S Stage

S1 1 5 301

S2 1 1 3 136

S3 1 10 62 36 3

Total 2 12 70 36 440



Pectoral muscle Disease status Total 
involvement 

T0 T1 T3 NI    None L M L+M



7 42 25 301 63 86 7 219 375 

7 21 18 139 51 55 9 70 185 



297 
132 



2 56 43 11 



51 63 5 188 307 
36 40 7 58 141 
27 38 4 43 112 



14 63 43 440 



114 141 16 289 560 



APPENDIX B 



The Fortran maximum likelihood estimation program, which uses the
Newton-Raphson procedure. The version listed performs the necessary
calculations for the estimators of the proportional hazards model
with fixed exponential relative risks.
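
For orientation, the Newton-Raphson iteration for the partial
likelihood can be sketched in a few lines of Python. This is a
simplified sketch under stated assumptions, not a translation of the
listing: it handles a single covariate with the exponential relative
risk exp(beta*z), treats ties by Breslow's approximation, and omits
step halving, stepwise selection and the alternative relative risk
forms. The names cox_derivatives and fit_cox are hypothetical.

```python
import math


def cox_derivatives(times, events, z, beta):
    """Log partial likelihood, score and information for one covariate.

    times  : follow-up time of each patient
    events : 1 if the patient died at that time, 0 if censored
    z      : covariate value of each patient
    """
    loglik = score = info = 0.0
    death_times = sorted({t for t, d in zip(times, events) if d == 1})
    for t in death_times:
        # Risk set: everyone still under observation at time t.
        risk = [i for i, ti in enumerate(times) if ti >= t]
        deaths = [i for i in risk if times[i] == t and events[i] == 1]
        s0 = sum(math.exp(beta * z[i]) for i in risk)
        s1 = sum(z[i] * math.exp(beta * z[i]) for i in risk)
        s2 = sum(z[i] ** 2 * math.exp(beta * z[i]) for i in risk)
        m = len(deaths)  # number of tied deaths at t
        loglik += sum(beta * z[i] for i in deaths) - m * math.log(s0)
        score += sum(z[i] for i in deaths) - m * s1 / s0
        info += m * (s2 / s0 - (s1 / s0) ** 2)
    return loglik, score, info


def fit_cox(times, events, z, tol=1e-8, max_iter=25):
    """Newton-Raphson: beta <- beta + score / information, from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        _, score, info = cox_derivatives(times, events, z, beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta
```

The quantities s0, s1 and s2 accumulated over each risk set play the
same role as the sums sr, srb and srbb in subroutine phr, and the
observed and expected parts of the score correspond to sratd and
sarat.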
      real blank, r, rb, idno, beta
      common tdeath(1000), istdvr(40), iselvr(40), beta(40),
     3 rb(40), u2cinv(40), dd(40), z(40), zk(40),
     2 r, idno, ns, ireprt, irej, alnl1, kch, ndatin, distnc,
     4 model, itime, mxhalf, status, stp, ipr, npbase, liktyp,
     1 nvar, maxitr, alnl, time, ncase, ndead, break, curtim, timint,
     5 ivtime, ivstat, iventr, ividno, is, nvtot, istep, niter, ivznp
      dimension bout(40,50)
      data bout/2000*0./
C
CCCC define all file output and input
C
      open (5, file='phr.r5')
      open (6, file='phr.w6')
      open (8, file='phr.r8')
      open (9, file='phr.w9')
      open (11, file='phr.w11')
      rewind 5
      rewind 6
      rewind 8
      rewind 9
      rewind 11



C
CC read all the data and specification for analysis
C
      read (5,903) kch, liktyp, ndatin, nvtot, ivstat, ivtime, iventr,
     1 ividno, is, itime, mxhalf, ipr, npbase, ivznp
      if (mxhalf.eq.0) mxhalf = 5
      read (5,908) stp, timint, distnc, chient
      if (stp.eq.0) stp = .001
      if (timint.eq.0) timint = 1
      if (distnc.eq.0) distnc = 1
      if (chient.eq.0) chient = 1.32
      if (ipr.eq.1) read (5,908) zk
      ncin = 0
      ncsv = 0
      ndin = 0
      ndsv1 = 0
      ndsv2 = 0
      ndsv3 = 0
      sr = 0
   10 read (8,903,end=50) (dd(n), n=1,ndatin)
CCC transform any variables if needed
C
      call transf
      status = dd(ivstat)
      if (status.eq.1) go to 20
      ncin = ncin + 1
      go to 30
   20 ndin = ndin + 1
   30 if (irej.eq.1.and.(kch.eq.1.or.status.eq.0.or.liktyp.eq.1))
     1 go to 10
      if (status.eq.1) go to 40
      ncsv = ncsv + 1
      go to 10
   40 if (irej.eq.0) ndsv1 = ndsv1 + 1
      ndsv3 = ndsv3 + 1
      if (ivtime.ne.0) tdeath(ndsv1) = dd(ivtime)
      if (sr.gt.1.and.irej.eq.0) ndsv2 = ndsv2 + 1
      if (liktyp.eq.0.and.kch.eq.0) sr = 0
      go to 10
   50 nin = ncin + ndin
      nsv = ncsv + ndsv2
      ndead = ndsv2
      ncase = nsv
  100 read (5,903,end=800) istep, nvar, ns, ibref, maxitr, ireprt,
     1 model, iswise
      if (iswise.eq.0) go to 110
      nvarsv = nvar
      maxisv = maxitr
      maxitr = -1
  110 read (5,904) iselvr
      do 120 n=1,40
      istdvr(n) = 0
      if (n.le.ns) istdvr(n) = 1
  120 continue
C
CCCC insert MLE or use previous steps
C
      if (ibref.eq.0) go to 150
      do 130 n=1,40
  130 beta(n) = bout(n,ibref)
      go to 300
  150 read (5,908) beta
C
CCC obtain estimate for a constant rate null
C
      if (liktyp.eq.0.or.model.eq.6) go to 300
      do 200 n=1,nvar
      if (beta(n).ne.0) go to 300
  200 continue
      if (model.gt.1) go to 250
      beta(1) = alog(float(ndead)/(ncase - ndead))
      go to 300
  250 beta(1) = - float(ncase - 2*ndead)/(ncase - ndead)
  300 continue
      call phr
C
CCC use the MLE either for this step or
CCC use the MLE for forward selection based on largest
CCC Chi Squared (or use Forced variables)
C
      if (iswise.eq.0) go to 700
      nvar = ns + 1
      maxitr = maxisv
      ireprt = 0
      chimax = 0
      do 500 n=nvar,nvarsv
      if (u2cinv(n).lt.chimax) go to 500
      nmax = n
      chimax = u2cinv(n)
  500 continue
      if (chimax.ge.chient) go to 600
      go to 700
  600 saviv = iselvr(nvar)
      iselvr(nvar) = iselvr(nmax)
      iselvr(nmax) = saviv
      call phr
      if (nvar.eq.nvarsv) go to 700
      ns = nvar
      istdvr(ns) = 1
      nvar = nvarsv
      maxitr = -1
      go to 300
  700 do 750 n=1,40
  750 bout(n,istep) = beta(n)
      go to 100
  800 write (6,958)
      stop
  903 format (16i5)
  904 format (40i2)
  908 format (8f10.0)
  958 format ('normal ending')
      end

      subroutine phr
C
CCC perform the MLE calculations
C
      real r, rb, rbb, sr, srb, srbb, idno, betnam, unam, beta, r1
      real c, cinv, csave, d1, d2, estims, blank, vrnarg, const
      real betasv, u, sarat, sratd, shift
      common tdeath(1000), istdvr(40), iselvr(40), beta(40),
     3 rb(40), u2cinv(40), dd(40), z(40), zk(40),
     2 r, idno, ns, ireprt, irej, alnl1, kch, ndatin, distnc,
     4 model, itime, mxhalf, status, stp, ipr, npbase, liktyp,
     1 nvar, maxitr, alnl, time, ncase, ndead, break, curtim, timint,
     5 ivtime, ivstat, iventr, ividno, is, nvtot, istep, niter, ivznp
      dimension u(40), c(820), cinv(820), csave(820), betasv(40),
     1 corr(40), srb(40), srbb(820), sarat(40), sratd(40)
      dimension msv(1000), nrsksv(1000), srsv(1000)
      dimension ar(40)
C     a real constant is used here: the integer 10**10 overflows
      alnlst = -1.0e10
      nvsq2 = nvar*(nvar+1)/2
      niter = 0
      nsurv = ncase - ndead
      alnl0 = 0
      if (liktyp.eq.1) alnl0 = ndead*alog(float(ndead))
     1 + nsurv*alog(float(nsurv))
     2 - ncase*alog(float(ncase))
C
CCCC after initialising clear arrays and iterate
C
  100 m = 0
      i = 0
      niter = niter + 1
      alnl = 0
      sr = 0
      do 110 n=1,nvar
      sratd(n) = 0
      sarat(n) = 0
  110 srb(n) = 0
      do 120 nn=1,nvsq2
      c(nn) = 0
  120 srbb(nn) = 0
      break = tdeath(1) - timint/2
      if (itime.eq.0.and.iventr.eq.0) break = 0
      curtim = tdeath(1)
      nphr = 0
      ier = 0
      tlast = 999999.
      if (ireprt.lt.0) rewind 11

C
CCCC get data on each case together with ties if needed
C
  200 rewind 8
  210 call getr
      nphr = nphr + 1
      if (r.gt.0) go to 250
      if (niter.eq.1) go to 895
      alnl = 0
      alr1 = 0
      alr0 = 0
      ier = 2
      go to 510
  250 if (liktyp.eq.0) go to 300
C
CCCC Ignore the next loop if using partial likelihoods
C
      r1 = r + 1
      alnl = alnl - alog(r1)
      if (status.eq.1) alnl = alnl + alog(r)
      nn = 0
      do 260 n=1,nvar
      if (status.eq.1) sratd(n) = sratd(n) + rb(n)/r
      sarat(n) = sarat(n) + rb(n)/r1
      do 260 n1=1,n
      nn = nn + 1
  260 c(nn) = c(nn) + rb(n)*rb(n1)/(r1*r1*r)
      if (nphr.lt.ncase) go to 210
      go to 500



C
CCCC Begin partial likelihood estimation for survival times
C
  300 sr = sr + r
      nn = 0
      do 310 n=1,nvar
      srb(n) = srb(n) + rb(n)
      do 310 n1=1,n
      nn = nn + 1
  310 srbb(nn) = srbb(nn) + rb(n)*rb(n1)/r
      if (status.eq.0) go to 210
      if (kch.ne.0.and.time.ne.tdeath(i+1)) go to 210
C
CCCC ignore censored times and adjust deaths
C
      i = i + 1
C     count the deaths tied at this time (m is reset at label 455)
      m = m + 1
      if (ireprt.ge.1.and.niter.eq.1)
     1 write (9,999) i, nphr, irej, time, r, (rb(n), n=1,nvar)
CCCC if death has missing data, skip calculation
      if (irej.eq.1) go to 455
      alnl = alnl + alog(r)
      nn = 0
      do 320 n=1,nvar
  320 sratd(n) = sratd(n) + rb(n)/r
CCC loop to
CCC perform calculations for first and second derivative
      if (kch.eq.0.or.i.ge.ndead) go to 350
      if (tdeath(i).eq.tdeath(i+1)) go to 210

  350 nn = 0
      if (ireprt.ge.0) go to 361
      dt = tlast - tdeath(i)
      tlast = tdeath(i)
      alam0 = m/sr
      vlam0 = m*(nphr - m)/(nphr*sr*sr)
      do 360 n=1,nvar
  360 ar(n) = srb(n)/sr
      write (11,971) tdeath(i), dt, m, nphr, alam0, vlam0,
     1 (ar(n), n=1,nvar)
  971 format (2f10.3, 2i5, 41e15.6)
  361 continue
      msv(i) = m
      nrsksv(i) = nphr
      srsv(i) = sr
      alnl = alnl - m*alog(sr)
      if (niter.eq.1)
     1 alnl0 = alnl0 - m*alog(float(nphr))
      do 400 n=1,nvar
      sarat(n) = sarat(n) + m*srb(n)/sr
      do 400 n1=1,n
      nn = nn + 1
  400 c(nn) = c(nn) + m*(srbb(nn)/sr - srb(n)*srb(n1)/(sr*sr))
      if (ireprt.ge.1.and.niter.eq.1)
     1 write (9,998) sr, (srb(n), n=1,nvar)
      if (ireprt.ge.1.and.niter.eq.1) write (9,998)
  455 m = 0
C
CCCC Now for all cases a contribution to likelihood has been made
CCCC if using ordinary partial likelihood the next few lines
CCCC are not needed
C
      if (i.ge.ndead) go to 500
      if (kch.eq.0) go to 465
      curtim = tdeath(i+1)
      if (tdeath(i+1).ge.break) go to 210
  460 break = break - timint
      if (tdeath(i+1).lt.break) go to 460
  465 nphr = 0
      sr = 0
      do 470 n=1,nvar
  470 srb(n) = 0
      do 480 nn=1,nvsq2
  480 srbb(nn) = 0
      if (is.eq.1.and.(istep.ne.1.or.niter.ne.1)) go to 210
      if (kch.ne.0) go to 200
      go to 210

C
CCCC If likelihood is low enough take an estimate for this iteration
C
  500 if (alnl-alnlst.gt.-stp) go to 540
      ier = 1
      alr0 = 2*(alnl - alnl0)
      alr1 = 2*(alnl - alnl1)
  510 do 520 n=1,nvar
  520 beta(n) = (beta(n) + betasv(n))/2.
      rr = 0
      rw = 0
      nshift = nshift + 1
      niter = niter - 1
      write (6,906) niter, nshift, ier, shift, alnl, alr0, alr1,
     1 rr, rw
      write (6,910) (beta(n), n=1,nvar)
      if (niter.ge.maxitr.or.nshift.ge.mxhalf) go to 800
      go to 100
  540 nshift = 0
      aldiff = alnl - alnlst
      alnlst = alnl
      do 570 n=1,nvar
  570 u(n) = sratd(n) - sarat(n)

C
CCC Output results of first iteration, do a test for each variable
C
      if (niter.gt.1) go to 650
      alnl1 = alnl
      write (6,902)
      do 630 n=1,nvar
      nn = 0
      do 620 n1=1,nvar
      do 620 n2=1,n1
      nn = nn + 1
      csave(nn) = c(nn)
      if (n1.ne.n.and.istdvr(n1).eq.0) go to 610
      if (n2.ne.n.and.istdvr(n2).eq.0) go to 610
      go to 620
  610 csave(nn) = 0.
      if (n1.eq.n2) csave(nn) = 1.
  620 continue
CC invert matrix
      call linv1p (csave, nvar, cinv, 1, d1, d2, ier)
      nn = n*(n+1)/2
      u2cinv(n) = u(n)*u(n)*cinv(nn)
CCC calculate the chi value
      call mdch (u2cinv(n), 1., psig, ier)
      psig = 1. - psig
      if (iselvr(n).eq.0) go to 624
      if (ipr.eq.1) go to 625
      write (6,904) n, iselvr(n), istdvr(n),
     1 beta(n), sratd(n), sarat(n), u(n), c(nn), cinv(nn), u2cinv(n),
     2 psig
      go to 630
  624 write (6,904) n, iselvr(n), istdvr(n),
     1 beta(n), sratd(n), sarat(n), u(n), c(nn), cinv(nn), u2cinv(n),
     2 psig
      go to 630
  625 write (6,929) n, iselvr(n), zk(n), istdvr(n),
     1 beta(n), sratd(n), sarat(n), u(n), c(nn), cinv(nn), u2cinv(n),
     2 psig
  630 continue

C
CCCC get correlations and information matrix ready
C
      if (ipr.eq.1) go to 646
      write (6,903)
      if (iselvr(1).eq.0) go to 634
      go to 636
  634 if (nvar.ge.2) go to 635
      go to 636
  635 continue
  636 n11 = 0
      n12 = 0
      do 645 n1=1,nvar
      n11 = n11 + n1
      n22 = 0
      do 640 n2=1,n1
      n22 = n22 + n2
      n12 = n12 + 1
  640 corr(n2) = c(n12)/sqrt(c(n11)*c(n22))
      if (iselvr(n1).eq.0) go to 644
      write (6,901) (corr(n2), n2=1,n1)
      go to 645
  644 write (6,901) (corr(n2), n2=1,n1)
  645 if (nvar.gt.8) write (6,901)
  646 write (6,901)
      write (6,918) alnl0

C
CCC Invert the information matrix and check within range
C
  650 if (maxitr.lt.0) go to 890
      do 670 nn=1,nvsq2
  670 csave(nn) = c(nn)
      if (ipr.eq.0) go to 690
      nn = 0
      do 685 n=1,nvar
      do 685 n1=1,n
      nn = nn + 1
      if (u(n).ge.0.and.u(n1).ge.0) go to 685
      if (u(n).lt.0.and.beta(n).le.0) go to 675
      if (u(n1).lt.0.and.beta(n1).le.0) go to 675
      go to 685
  675 if (n.eq.n1) go to 680
      c(nn) = 0
      go to 685
  680 c(nn) = 1
  685 continue
      ns1 = ns
      do 686 n=1,nvar
      if (u(n).ge.0.or.beta(n).gt.0) go to 686
      u(n) = 0
      ns1 = ns1 + 1
  686 continue
  690 do 6905 nn=1,nvsq2
 6905 csave(nn) = c(nn)
CC invert matrix
      call linv1p (c, nvar, cinv, 1, d1, d2, ier)
      if (ier.eq.0) go to 691
      ier = 3
      rw = 0
      rr = 0
      alr0 = 2*(alnl - alnl0)
      alr1 = 2*(alnl - alnl1)
      niter = maxitr
      go to 770

C
CCC ignore the following for survival analysis with single loops
C
  691 rr = 0
      rw = 0
      shift = 1
      nnsv = 1
      do 692 n=1,nvar
  692 betasv(n) = beta(n)
      do 700 n=1,nvar
      nnsv = nnsv + n - 1
      nn = nnsv - 1
      do 695 n1=1,nvar
      nn = nn + 1
      if (n1.gt.n) nn = nn + n1 - 2
      rr = rr + u(n)*u(n1)*cinv(nn)
      rw = rw + betasv(n)*betasv(n1)*csave(nn)
      beta(n) = beta(n) + u(n1)*cinv(nn)*distnc
  695 continue
      if (ipr.eq.0.or.beta(n).ge.0) go to 700
      if (betasv(n).gt.0)
     1 shift = amin1(shift, betasv(n)/(betasv(n) - beta(n)))
      if (betasv(n).eq.0) beta(n) = 0
  700 continue
      if (niter.eq.1) rr0 = rr
      if (shift.eq.1) go to 750
      do 720 n=1,nvar
      beta(n) = beta(n)*shift + betasv(n)*(1-shift)
  720 if (beta(n).lt.betasv(n)*stp) beta(n) = 0
  750 shift = shift*distnc
      if (niter.gt.1) go to 770
      write (6,905)
      if (ipr.eq.1) go to 760
      if (iselvr(1).eq.0) go to 758
      go to 765
  758 if (nvar.ge.2) go to 759
      go to 765
  759 continue
      go to 765
  760 write (6,913) (zk(n), n=1,nvar)
  765 write (6,901)
  770 alr0 = 2*(alnl - alnl0)
      alr1 = 2*(alnl - alnl1)
      write (6,906) niter, nshift, ier, shift, alnl, alr0, alr1,
     1 rr, rw
      write (6,910) (beta(n), n=1,nvar)
      write (6,910) (u(n), n=1,nvar)
      write (6,901)
C
CCCC stop all loops
C
      if (maxitr.eq.0) go to 870
      if (niter.ge.maxitr) go to 800
      if (rr.ge.stp) go to 100
      if (aldiff.gt.(stp*10.).and.shift.eq.1) go to 100

CCC final estimators get SE and MLE for relative risk
C
  800 if (ipr.eq.0) go to 810
      write (6,927)
      r = 0
      if (npbase.ne.0) r = - beta(npbase)
      n = 0
      var = 0
      if (npbase.ne.0) var = cinv(npbase*(npbase+1)/2)
      expr = exp(r)
      sebeta = sqrt(var)
      expse = exp(-sebeta)
      chisq = 0
      if (var.ne.0) chisq = r*r/var
      z0 = 0
      write (6,928) n, z0, r, sebeta, expr, expse, chisq
      var = 0
      do 808 n=1,nvar
      r = r + beta(n)
      if (beta(n).eq.0) go to 808
      if (n.eq.npbase) go to 807
      var = var + cinv(n*(n+1)/2)
      nn1 = n*(n-1)/2 + 1
      nn2 = n*(n+1)/2 - 1
      n1 = 0
      do 805 nn=nn1,nn2
      n1 = n1 + 1
  805 if (beta(n1).ne.0.and.npbase.ne.n1) var = var + 2*cinv(nn)
  807 se = sqrt(var)
      chisq = 0
      if (se.ne.0) chisq = (r/se)**2
      expr = exp(r)
      expse = exp(se)
      sebeta = sqrt(cinv(n*(n+1)/2))
      write (6,928) n, zk(n), beta(n), sebeta, expr, expse, chisq
  808 continue
      go to 862

CCC in order to use the following change the above lines
CCC the following will use estimates for removal of a variable
C
  810 write (6,907)
      if (iselvr(1).eq.0) go to 814
      go to 816
  814 if (nvar.ge.2) go to 815
      go to 816
  815 continue
  816 write (6,910) (beta(n), n=1,nvar)
      write (6,915)
      do 820 n=1,nvar
  820 csave(n) = 1/csave(n*(n+1)/2)
      write (6,910) (csave(n), n=1,nvar)
      write (6,916)
      do 850 n=1,nvar
      nn1 = n*(n-1)/2 + 1
      nn2 = n*(n+1)/2
      if (iselvr(n).eq.0) go to 825
      write (6,910) (cinv(nn), nn=nn1,nn2)
      go to 850
  825 write (6,910) (cinv(nn), nn=nn1,nn2)
  850 if (nvar.gt.8) write (6,901)
      do 860 n=1,nvar
  860 u2cinv(n) = beta(n)*beta(n)/cinv(n*(n+1)/2)
      write (6,911)
      write (6,913) (u2cinv(n), n=1,nvar)
      rw0 = 0
      nn = 0
      ns1 = ns + 1
      ndf = nvar - ns
      do 8600 n=ns1,nvar
      nn0 = n*(n-1)/2 + ns
      do 8600 n1=ns1,n
      nn = nn + 1
      nn0 = nn0 + 1
 8600 cinv(nn) = cinv(nn0)
CC invert matrix
      call linv1p (cinv, ndf, c, 1, d1, d2, ier)
      nnsv = 1
      do 8610 n=ns1,nvar
      nnsv = nnsv + n - ns - 1
      nn = nnsv - 1
      do 8610 n1=ns1,nvar
      nn = nn + 1
      if (n1.gt.n) nn = nn + n1 - ns - 2
 8610 rw0 = rw0 + beta(n)*beta(n1)*c(nn)

  862 if (nshift.ne.0.or.niter.ge.maxitr) write (6,914)
      alrchi = 2.*(alnl - alnl0)
      df = nvar
      if (ipr.eq.1) df = df - ns1
CC calculate the chi-square value
      call mdch (alrchi, df, psig, ier)
      psig = 1. - psig
      write (6,920)
      write (6,908) alrchi, df, psig
CC calculate the chi-square value
      call mdch (rw, df, psig, ier)
      psig = 1. - psig
      write (6,924) rw, df, psig
      alrchi = 2*(alnl - alnl1)
      df = nvar - ns
CC calculate the chi-square value
      call mdch (alrchi, df, psig, ier)
      psig = 1. - psig
      write (6,921)
      write (6,908) alrchi, df, psig
  870 df = nvar - ns
CC calculate the chi-square value
      call mdch (rr0, df, psig, ier)
      psig = 1. - psig
      write (6,912) rr0, df, psig
CC calculate the chi-square value
      call mdch (rw0, df, psig, ier)
      psig = 1. - psig
      write (6,924) rw0, df, psig
      if (maxitr.ge.0) go to 881
      do 880 n=1,nvar
  880 beta(n) = betasv(n)
      return
  881 if (kch.eq.0) return
      write (6,930)
      ch0 = 0
      ch1 = 0
      skm0 = 1
      skm1 = 1
      j = ndead + 1
      do 885 i=1,ndead
C     step back through the death times, last stored entry first
      j = j - 1
      if (j.eq.ndead) go to 882
      if (tdeath(j).eq.tdeath(j+1)) go to 885
  882 alam0 = msv(j)/srsv(j)
      alam1 = msv(j)/float(nrsksv(j))
      rbar = srsv(j)/nrsksv(j)
      ch0 = ch0 + alam0
      ch1 = ch1 + alam1
      skm0 = skm0*(1-alam0)
      skm1 = skm1*(1-alam1)
      sbres0 = exp(-ch0)
      sbres1 = exp(-ch1)
      write (6,931) tdeath(j), nrsksv(j), msv(j), alam1, ch1, skm1,
     1 sbres1, rbar, alam0, ch0, skm0, sbres0
  885 continue
  890 return
  895 write (6,922) nphr, i, key2, (dd(l), l=1,ndatin)
      write (6,923) r, (n, rb(n), beta(n), n=1,nvar)
      stop

  901 format (4x,8f15.5/(13x,8f15.5))
  902 format (' variable beta',
     1 ' observed expected u c(i)',
     2 ' cinv(i) chi sq p'/)
  903 format (' correlation matrix')
  904 format (i6, i7, 6x, i8, 6e13.5, f9.3, f11.6)
  905 format ('-iteration outputs ;'/' iteration increment ',
     1 ' error diffs log lr test ',
     2 ' lr test one two'/
     3 ' number halving codes multiplier likelihood ',
     4 ' h0:beta=0 h0:beta=beta0 h0:beta=mle ',
     5 ' h0:beta=0'/)
  906 format (3i11, 6f15.6)
  907 format (' MLE of beta'/)
  908 format (' LR chi square=', f12.6, f4.0, ' df, p =', f9.6)
  910 format (4x,8e15.6/(13x,8e15.6))
  911 format (' tests to remove')
  912 format (' Chi square=', f12.6, f4.0, ' df, p =', f9.6)
  913 format (13x,8f15.3)
  914 format (' no convergence')
  915 format (' variances, with Asym normality')
  916 format (' temps')
  918 format (' ln(L) null hypothesis = ', f13.4)
  920 format (' significance')
  921 format (' adds')
  922 format (' initial estimates of beta produce zero or negative',
     1 ' r for subject ', i5, ' of risk set ', i5, ' with id = ', a8/
     2 ' raw data for subject follows'/10f13.6/10f13.6)
  923 format (' computed r = ', e16.5/' n rb(n)',
     1 ' beta(n)'/(i5, 2x, 2e15.5))
  924 format (' test chi square=', f12.6, f4.0, ' df, p = ', f9.6)
  927 format ('-final estimate of the relative risk',
     1 ' function and its asymptotic standard error'/' increment',
     2 ' z beta se(beta) r se(r) cum. rw test')
  928 format (i7, f10.3, 2f10.6, 3f11.4)
  929 format (i6, i7, 6x, f6.3, i8, 6e13.5, f9.3, f11.6)
  930 format ('1 estimated survival functions at the mle of beta'/
     1 '0 survival num num entire set - unadjusted for covar',
     2 'iates', 10x, 'mean', 8x,
     3 ' null functions - i.e. evaluated at z = 0'/
     4 ' time at dead hazard cumulative survival func',
     5 'tions', 10x, 'risk', 8x,
     6 'hazard cumulative survival functions'/
     7 ' risk rate hazard cox',
     8 ' ', 10x, ' ', 8x, ' rate hazard cox '/)
  931 format (f8.1, i7, i5, 4x, 2f12.7, 2f11.5, f13.4, 4x, 2f12.7,
     1 2f11.5)
  998 format (38x, f10.4, 6x, 6e13.5/(3x, 10e13.5))
  999 format (' ', i4, i5, 2x, i3, f6.2, 7x, f10.4, 6x, 6e13.5/
     1 (3x, 10e13.5))
      end

sub r out i n e g et r 

real b 1 a n k , r , r b . i d n o , b e t a 

common tdeath ( 1000) , i stdvr (40) , i sel vr (40) , beta (40) , 

3 rb (40) ,u2cinv(40) , dd (40) ,2 (40) ,zk (40) , 

2 r . idno, ns, ireprt , irej, alrtl 1 , kch , ndat in , di stnc , 

4 model ,iti me , mxhal f , status » stp , i pr , npbase , 1 i ktyp , 

1 n var . ma;: i t r , a 1 n 1 , t i me , n c ase , n d ead , b r ea k , c ur t i m , t i m i n t , 

5 ivtime, i vstat, iventr , i vidno, is, nvtot, istep, niter, ivznp 
      if (is.eq.1 .and. (istep.ne.1 .or. niter.ne.1)) go to 150
      time = 0
      idno = 0
      tentry = 0
  100 read (8,903) (dd(n), n=1,ndatin)
903 format (1615) 

      call transf
      status = dd(ivstat)
      if (irej.eq.1 .and. (kch.eq.1 .or. status.eq.0 .or. liktyp.eq.1))
     1 go to 100
      if (ivtime.ne.0) time = dd(ivtime)
      if (ividno.ne.0) idno = dd(ividno)
      if (iventr.ne.0) tentry = dd(iventr)
      if (tentry.gt.curtim .and. kch.eq.1) go to 100
  150 continue
  151 do 200 n=1,nvar
      if (iselvr(n).eq.0) go to 190
      z(n) = dd(iselvr(n))
      go to 200
  190 z(n) = 1
  200 continue

      if (ipr.eq.1) go to 700
      go to (300, 400, 500, 600, 600, 650), model
      go to 899
c     model 1: r = exp(sum of beta(n)*z(n)), rb(n) = dr/dbeta(n)
  300 betaz = 0
      do 310 n=1,nvar
  310 betaz = betaz + beta(n)*z(n)
      r = exp(betaz)
      do 320 n=1,nvar
  320 rb(n) = r*z(n)
      go to 800
c     model 2: r = 1 + sum of beta(n)*z(n)
  400 r = 1
      do 410 n=1,nvar
      r = r + beta(n)*z(n)
  410 rb(n) = z(n)
      go to 800
c     model 3: r = 1 + sum of exp(beta(n))*z(n)
  500 r = 1
      do 510 n=1,nvar
      ebeta = exp(beta(n))
      r = r + ebeta*z(n)
  510 rb(n) = ebeta*z(n)
      go to 800
  600 call comb
      go to 800
  650 call mdlsub
      go to 800
c     ipr = 1: add beta(kz) while z(ivznp) .ge. zk(kz), then r = exp(r)
  700 r = 0.
  710 kz = kz + 1
      if (z(ivznp).lt.zk(kz) .or. kz.gt.nvar) go to 720
      r = r + beta(kz)
      go to 710
  720 r = exp(r)
      kzm1 = kz - 1
      do 725 n=1,kzm1
  725 rb(n) = r
      if (kz.gt.nvar) go to 740
      do 730 n=kz,nvar
  730 rb(n) = 0
  740 continue
  800 return
  899 continue
  999 continue
      stop
      end
      subroutine comb
      real blank, r, rb, idno, beta
      common tdeath(1000), istdvr(40), iselvr(40), beta(40),
     3 rb(40), u2cinv(40), dd(40), z(40), zk(40),
     2 r, idno, ns, ireprt, irej, alnl1, kch, ndatin, distnc,
     4 model, itime, mxhalf, status, stp, ipr, npbase, liktyp,
     1 nvar, maxitr, alnl, time, ncase, ndead, break, curtim, timint,
     5 ivtime, ivstat, iventr, ividno, is, nvtot, istep, niter, ivznp
      dimension rab(40), rmb(40)
      nm1 = nvar - 1
      b = beta(nvar)
      if (nvar.gt.ns) go to 300
      nm1 = ns
  300 go to (800, 800, 800, 400, 500), model
      go to 800
c     model 4: radd = 1 + sum of beta*z, rmult = product of (1+beta*z)
  400 radd = 1
      rmult = 1
      do 450 n=1,nm1
      radd = radd + beta(n)*z(n)
      rab(n) = z(n)
  450 rmult = rmult * (1 + beta(n)*z(n))
      do 460 n=1,nm1
  460 rmb(n) = z(n)*rmult/(1+beta(n)*z(n))
      go to 700
c     model 5: as model 4, with each non-zero z(n) taken as 1 and
c     only the first two non-zero covariates (i1, i2) used
  500 ndum = 0
      do 550 n=1,nm1
      rab(n) = 0
      rmb(n) = 0
      if (z(n).eq.0) go to 550
      go to (530, 550), ndum
      i1 = n
      ndum = 1
      go to 550
  530 i2 = n
      ndum = 2
  550 continue
      go to (570, 580), ndum
      radd = 1
      rmult = 1
      go to 700
  570 radd = 1 + beta(i1)
      rmult = radd
      rab(i1) = 1
      rmb(i1) = 1
      go to 700
  580 radd = 1 + beta(i1) + beta(i2)
      rmult = radd + beta(i1)*beta(i2)
      rab(i1) = 1
      rab(i2) = 1
      rmb(i1) = 1 + beta(i2)
      rmb(i2) = 1 + beta(i1)
c     mixture: r = radd**(1-b) * rmult**b, with b = beta(nvar)
  700 bm1 = 1 - b
      r = radd ** bm1 * rmult ** b
      rb(nvar) = r * (alog(rmult) - alog(radd))
      do 750 n = 1,nm1
  750 rb(n) = r * (bm1 * rab(n) / radd + b * rmb(n) / rmult)
  800 return
      end

      subroutine mdlsub
      real r, rb, beta
      common j1(1160), beta(40), rb(40), j2(100), z(40), j3(40),
     1 r, j4(17), nvar, j5(17)
      return
      end

c     this is the same subroutine as transf with dummy
      subroutine drtans
      common j1(1360), dd(40), j2(106), irej, j3(19), curtim, j4(10)
      irej = 0
      return
      end



      subroutine transf
      common j1(1360), dd(40), j2(106), irej, j3(19), curtim, j4(10)
      do 100 n=10,16
  100 dd(n)=0
      irej = 0
      return
      end